Method

Hybrid Multi-level Fusion for Online Occlusion-aware Monocular 3D Object Detection [MonoHMOO]


Submitted on 2 Apr. 2021 06:50 by
He Liu (Tsinghua University)

Running time: 0.2 s
Environment: 1 core @ 2.5 GHz (Python + C/C++)

Method Description:
We propose a deep hybrid multi-level fusion
architecture for monocular 3D object detection,
together with an online occlusion-aware
optimization process. We integrate the monocular
3D features with the pseudo-LiDAR filter
generation network between hybrid multi-level
layers, which exploits the inherent multi-scale
structure and promotes the flow of depth and
semantic information across stages.
Parameters:
We train with the stochastic gradient descent
(SGD) optimizer, using a momentum of 0.9 and a
weight decay of 0.0005. Training runs for 40,000
iterations under the "poly" learning rate policy,
with a base learning rate of 0.01 and a power of
0.9. Because the method is still converging at
that point and performance continues to improve,
we train for an extra 10,000 iterations during
which the learning rate is not adjusted. With
ResNet50, training takes about one day at batch
size 8.
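
As an illustration of the "poly" policy mentioned above, the sketch below uses the commonly used form lr = base_lr * (1 - iter / max_iter) ^ power with the stated values (base learning rate 0.01, power 0.9, 40,000 iterations). The function name poly_lr is hypothetical and this is not the authors' code. The extra 10,000 iterations are not modeled, because the description does not state which learning rate value is held during them.

```python
def poly_lr(iteration, base_lr=0.01, max_iter=40_000, power=0.9):
    """Learning rate under the 'poly' policy (assumed standard form).

    lr = base_lr * (1 - iteration / max_iter) ** power
    """
    if not 0 <= iteration <= max_iter:
        raise ValueError("iteration must lie in [0, max_iter]")
    return base_lr * (1.0 - iteration / max_iter) ** power


if __name__ == "__main__":
    # Print the schedule at a few checkpoints of the 40,000-iteration run.
    for it in (0, 10_000, 20_000, 30_000, 40_000):
        print(f"iter {it:>6}: lr = {poly_lr(it):.6f}")
```

Under this form the rate starts at the base value and decays to zero at the final scheduled iteration, so any fixed rate used afterwards has to be chosen separately.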

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).
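
For intuition about the orientation metric, the sketch below computes the per-recall-level orientation similarity term that AOS averages, assuming the definition published with the KITTI benchmark: each matched detection contributes (1 + cos(delta)) / 2, where delta is the angle difference between predicted and ground-truth orientation, and unmatched detections contribute zero. The function name and argument layout are hypothetical, and the averaging over recall levels used by the official evaluation is not reproduced here.

```python
import math


def orientation_similarity(delta_thetas, matched):
    """Mean normalized cosine similarity over a set of detections.

    delta_thetas: angle differences (radians) between predicted and
                  ground-truth orientation, one per detection.
    matched:      booleans, True if the detection was assigned to a
                  ground-truth box; unmatched detections score 0.
    """
    if not delta_thetas or len(delta_thetas) != len(matched):
        raise ValueError("need equally sized, non-empty inputs")
    total = sum(
        (1.0 + math.cos(delta)) / 2.0
        for delta, is_matched in zip(delta_thetas, matched)
        if is_matched
    )
    return total / len(delta_thetas)
```

A perfectly oriented, matched detection scores 1, and a detection pointing the opposite way scores 0. Because every term is at most 1, a score built this way cannot exceed the fraction of matched detections, which is consistent with each Orientation row in the table below being lower than the corresponding Detection row.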


Benchmark                       Easy      Moderate  Hard
Car (Detection)                 92.33 %   78.21 %   61.58 %
Car (Orientation)               91.51 %   75.95 %   59.55 %
Car (3D Detection)              20.28 %   13.12 %    9.56 %
Car (Bird's Eye View)           27.39 %   17.60 %   13.25 %
Pedestrian (Detection)          49.26 %   34.74 %   30.37 %
Pedestrian (Orientation)        38.13 %   26.28 %   22.91 %
Pedestrian (3D Detection)        7.62 %    5.23 %    4.28 %
Pedestrian (Bird's Eye View)     8.69 %    5.62 %    5.25 %
Cyclist (Detection)             37.41 %   23.59 %   21.20 %
Cyclist (Orientation)           23.82 %   15.24 %   13.84 %
Cyclist (3D Detection)           1.87 %    1.60 %    1.66 %
Cyclist (Bird's Eye View)        1.91 %    1.65 %    1.75 %


[Figure: 2D object detection results.]
[Figure: Orientation estimation results.]
[Figure: 3D object detection results.]
[Figure: Bird's eye view results.]

The page shows this set of four plots three times.


