Method

Monocular 3D Object Detection with Multi-plane Inverse Perspective Mapping [MonoMIP]
[Anonymous Submission]

Submitted on 9 Mar. 2026 13:42 by
[Anonymous Submission]

Running time: 0.01 s
Environment: 1 core @ 2.5 GHz (C/C++)

Method Description:
MonoMIP is a monocular 3D object detector that constructs BEV-aligned features (BEV: bird's eye view) via multi-plane inverse perspective mapping and predicts 2D and 3D object properties jointly in image and BEV space. By projecting image features onto multiple height planes instead of relying on a single ground-plane assumption, it builds a richer BEV representation for monocular 3D detection.
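The idea of multi-plane inverse perspective mapping can be illustrated with a minimal sketch. This is not the submission's implementation: the function names, the nearest-neighbour sampling, and the camera convention (x right, y down, z forward, with a pinhole camera mounted `cam_height` metres above the ground) are all assumptions made for illustration. For each height plane, every BEV cell is projected into the image and the feature at that pixel is copied into the cell.

```python
def project_to_image(x, y, z, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point (assumed: x right, y down, z forward)."""
    return fx * x / z + cx, fy * y / z + cy


def multi_plane_ipm(feat, fx, fy, cx, cy, cam_height, plane_heights, xs, zs):
    """Sample an image feature map onto one BEV grid per height plane.

    feat is a 2D list indexed [v][u]. Returns bev[plane][iz][ix], with None
    where a BEV cell projects outside the image. All names are hypothetical.
    """
    rows, cols = len(feat), len(feat[0])
    bev = []
    for h in plane_heights:
        # A plane h metres above the ground sits at y = cam_height - h,
        # because the y axis points down from the camera.
        y = cam_height - h
        plane = []
        for z in zs:  # forward distances, must be > 0
            row = []
            for x in xs:  # lateral offsets
                u, v = project_to_image(x, y, z, fx, fy, cx, cy)
                ui, vi = round(u), round(v)  # nearest-neighbour sampling
                inside = 0 <= ui < cols and 0 <= vi < rows
                row.append(feat[vi][ui] if inside else None)
            plane.append(row)
        bev.append(plane)
    return bev
```

With a single entry in `plane_heights` equal to 0, this reduces to the classic ground-plane mapping; adding further planes lets features from points above the ground land in the BEV cell they actually belong to.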
Parameters:
none
LaTeX BibTeX:

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).


Benchmark             | Easy    | Moderate | Hard
Car (Detection)       | 93.75 % | 88.75 %  | 78.92 %
Car (Orientation)     | 93.60 % | 88.13 %  | 78.16 %
Car (3D Detection)    | 28.21 % | 19.87 %  | 16.91 %
Car (Bird's Eye View) | 36.86 % | 25.40 %  | 22.12 %
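The orientation rows report average orientation similarity (AOS). As a rough sketch of the quantity being averaged, assuming the cosine-based similarity commonly described for this benchmark, a detection with angular error delta scores (1 + cos(delta)) / 2 and an unmatched detection scores 0. The function and argument names below are hypothetical, and the averaging over recall levels that turns this into AOS is omitted.

```python
import math


def orientation_similarity(angle_errors, matched):
    """Mean orientation similarity over a set of detections.

    angle_errors: per-detection orientation error in radians.
    matched: per-detection flag, True if the detection matches a ground-truth box.
    Unmatched detections contribute 0 to the sum but still count in the mean.
    """
    if not angle_errors:
        return 0.0
    total = sum(
        (1.0 + math.cos(d)) / 2.0
        for d, m in zip(angle_errors, matched)
        if m
    )
    return total / len(angle_errors)
```

Because the similarity is at most 1 per detection, an orientation score can never exceed the corresponding detection score, which is consistent with the Car (Orientation) row sitting slightly below the Car (Detection) row.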


2D object detection results.



Orientation estimation results.



3D object detection results.



Bird's eye view results.



