Method

MonoSC: Enhancing Monocular 3D Object Detection by 2D Segmentation and Completion [MonoSC]
[Anonymous Submission]

Submitted on 30 Apr. 2025 08:42 by [Anonymous Submission]

Running time: 0.2 s
Environment: 1 core @ 2.5 GHz (C/C++)

Method Description:
MonoSC uses the accurate 2D predictions of
existing detectors as priors. It segments the
prior objects with SAM (Segment Anything Model)
and uses a generative network to complete those
that are incomplete or blurred. Improving the
objects' visibility and clarity is intended to
make their feature representations more robust
and accurate, and therefore more useful for the
3D detection task. The framework has three stages
applied in sequence: a SAM-driven segmentation
module, a generative completion module, and an
object-of-interest detection module.
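
The three stages described above can be sketched as a simple per-object loop. This is a minimal illustration only, not the submitted implementation: the class `ThreeStagePipeline`, the helper `crop`, and the three `toy_*` functions are hypothetical stand-ins, and the per-box crop-then-enhance flow is an assumption about how the stages connect. A real system would plug in SAM, a generative completion network, and a 3D detection head in place of the toy callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box2D = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, x2/y2 exclusive
Image = List[List[int]]            # toy grayscale image as nested lists
Mask = List[List[bool]]


def crop(image: Image, box: Box2D) -> Image:
    """Cut the region covered by a 2D prior box out of the image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


@dataclass
class ThreeStagePipeline:
    # All three callables are hypothetical placeholders for the real modules.
    segment: Callable[[Image], Mask]           # stage 1: segmentation (SAM in the text)
    complete: Callable[[Image, Mask], Image]   # stage 2: generative completion
    detect_3d: Callable[[Image, Box2D], dict]  # stage 3: object-of-interest detection

    def run(self, image: Image, prior_boxes: List[Box2D]) -> List[dict]:
        results = []
        for box in prior_boxes:
            patch = crop(image, box)
            mask = self.segment(patch)             # which pixels belong to the object
            enhanced = self.complete(patch, mask)  # fill in the missing pixels
            results.append(self.detect_3d(enhanced, box))
        return results


def toy_segment(patch: Image) -> Mask:
    # Toy rule: non-zero pixels count as visible object pixels.
    return [[px > 0 for px in row] for row in patch]


def toy_complete(patch: Image, mask: Mask) -> Image:
    # Toy rule: fill pixels outside the mask with the mean of the visible ones.
    visible = [px for row, mrow in zip(patch, mask) for px, m in zip(row, mrow) if m]
    fill = sum(visible) // len(visible) if visible else 0
    return [[px if m else fill for px, m in zip(row, mrow)]
            for row, mrow in zip(patch, mask)]


def toy_detect_3d(patch: Image, box: Box2D) -> dict:
    # Toy rule: echo the inputs; a real head would regress a 3D box here.
    return {"box2d": box, "patch": patch}


pipeline = ThreeStagePipeline(toy_segment, toy_complete, toy_detect_3d)
image = [
    [0, 0, 0, 0],
    [0, 8, 0, 0],
    [0, 4, 6, 0],
    [0, 0, 0, 0],
]
results = pipeline.run(image, [(1, 1, 3, 3)])
```

Passing the stages in as callables keeps the sketch independent of any particular segmentation or generative model.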
Parameters:
lr=0.00125

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).
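
As a rough illustration of the orientation term behind AOS, the sketch below computes the orientation similarity for one set of detections, assuming the commonly cited definition in which each true-positive detection contributes (1 + cos Δθ) / 2 and unmatched detections contribute zero. The function name `orientation_similarity` and the input layout are hypothetical, and the averaging over recall levels used by the full metric is left out.

```python
import math
from typing import List, Tuple


def orientation_similarity(detections: List[Tuple[float, bool]]) -> float:
    """Mean orientation similarity over all detections.

    Each entry is (delta_theta, is_true_positive), where delta_theta is the
    angle difference in radians between predicted and ground-truth orientation.
    False positives contribute zero but still count in the denominator.
    """
    if not detections:
        return 0.0
    total = 0.0
    for delta_theta, is_true_positive in detections:
        if is_true_positive:
            total += (1.0 + math.cos(delta_theta)) / 2.0
    return total / len(detections)
```

Because each term lies in [0, 1], a score built this way cannot exceed the fraction of true positives, which is why AOS values sit at or below the corresponding detection AP.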


Benchmark                Easy      Moderate   Hard
Car (Detection)          88.86 %   81.52 %    70.96 %
Car (Orientation)        88.48 %   80.83 %    70.12 %
Car (3D Detection)       24.57 %   18.03 %    15.80 %
Car (Bird's Eye View)    32.93 %   24.74 %    21.38 %
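
One quick way to read the Detection and Orientation rows together is the ratio AOS / AP per difficulty level, which indicates how much of the 2D detection score is retained once orientation error is taken into account. The snippet below is only a worked calculation on the numbers in the table above; the variable names are illustrative.

```python
# Car AP (Detection) and AOS (Orientation) values in percent, copied from the table.
ap = {"Easy": 88.86, "Moderate": 81.52, "Hard": 70.96}
aos = {"Easy": 88.48, "Moderate": 80.83, "Hard": 70.12}

# Ratio close to 1.0 means orientation estimates cost little on top of detection.
ratios = {level: aos[level] / ap[level] for level in ap}

for level, ratio in ratios.items():
    print(f"{level}: AOS/AP = {ratio:.4f}")
```

All three ratios come out near 0.99, with a slight decrease from Easy to Hard.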


2D object detection results.



Orientation estimation results.



3D object detection results.



Bird's eye view results.



