Method

MonoDETR: Depth-aware Transformer for Monocular 3D Object Detection [MonoDETR]
https://github.com/ZrrSkywalker/MonoDETR.git

Submitted on 4 Apr. 2022 10:26 by
fu kexue (fudan)

Running time: 0.04 s
Environment: 1 core @ 2.5 GHz (Python)

Method Description:
We introduce a simple framework for Monocular
DEtection with a depth-aware TRansformer, named
MonoDETR. We make the vanilla transformer
depth-aware and let depth guide the whole detection
process. Specifically, we represent 3D object
candidates as a set of queries and produce non-local
depth embeddings of the input image with a
lightweight depth predictor and an attention-based
depth encoder. We then propose a depth-aware decoder
that performs both inter-query and query-scene depth
feature communication. In this way, each object
estimates its 3D attributes adaptively from the
depth-informative regions of the image, rather than
being limited to features around its center. With
minimal handcrafted design, MonoDETR is an
end-to-end framework that requires no additional
data, anchors or NMS, and it achieves competitive
performance on the KITTI benchmark among
state-of-the-art center-based networks.
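The two kinds of communication described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes single-head attention without learned projections, layer normalization, feed-forward layers or the visual cross-attention that a full decoder would also need, and the ordering of the two steps is an assumption. The function names (softmax, attention, depth_aware_decoder_layer) are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention on plain Python lists.

    Assumption: no learned projection matrices, for brevity.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        outputs.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return outputs


def add_residual(xs, ys):
    """Element-wise residual connection between two lists of vectors."""
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, ys)]


def depth_aware_decoder_layer(queries, depth_embeddings):
    """Hypothetical, simplified decoder layer.

    Step 1 (query-scene): each object query gathers information from
    all depth embeddings of the image, not only those near its center.
    Step 2 (inter-query): queries exchange information with each other.
    """
    depth_context = attention(queries, depth_embeddings, depth_embeddings)
    queries = add_residual(queries, depth_context)
    query_context = attention(queries, queries, queries)
    return add_residual(queries, query_context)


if __name__ == "__main__":
    object_queries = [[1.0, 0.0], [0.0, 1.0]]          # two toy queries
    depth_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy depth embeddings
    updated = depth_aware_decoder_layer(object_queries, depth_tokens)
    print(len(updated), len(updated[0]))
```

Because every query attends over all depth embeddings, the depth cue a query uses is not tied to the pixels around the object's center, which is the property the description emphasizes.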
Parameters:
ResNet50 + 3*Encoder + 3*Decoder
LaTeX BibTeX:
@article{zhang2022monodetr,
  title={MonoDETR: Depth-aware Transformer for Monocular 3D Object Detection},
  author={Zhang, Renrui and Qiu, Han and Wang, Tai and Xu, Xuanzhuo and Guo, Ziyu and Qiao, Yu and Gao, Peng and Li, Hongsheng},
  journal={arXiv preprint arXiv:2203.13310},
  year={2022}
}

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).
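As a rough illustration of the orientation part of the metric, the sketch below computes an orientation similarity score for a set of detections at a single recall level, using the commonly cited form s = (1/|D|) * sum over detections of (1 + cos(delta)) / 2, where delta is the angle error and only detections matched to a ground-truth object contribute. This is a simplified sketch and not the official evaluation code: the function name and inputs are hypothetical, and the averaging over recall levels that turns this score into AOS is omitted.

```python
import math


def orientation_similarity(angle_errors, matched):
    """Orientation similarity for one set of detections.

    angle_errors: per-detection difference (radians) between the
        estimated and ground-truth orientation.
    matched: per-detection flag, True if the detection was assigned
        to a ground-truth object (unmatched detections contribute 0).
    Both inputs are hypothetical; a real evaluator derives them from
    the 2D matching step.
    """
    if not angle_errors:
        return 0.0
    total = 0.0
    for delta, is_matched in zip(angle_errors, matched):
        if is_matched:
            # 1.0 for a perfect angle, 0.0 for an error of pi radians
            total += (1.0 + math.cos(delta)) / 2.0
    return total / len(angle_errors)


if __name__ == "__main__":
    errors = [0.0, math.pi / 2, math.pi]
    flags = [True, True, True]
    print(orientation_similarity(errors, flags))
```

Since each term is at most 1 and false positives contribute 0, the score can never exceed the corresponding detection precision, which is consistent with the Car (Orientation) values in the table being slightly lower than the Car (Detection) values.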


Benchmark                Easy       Moderate   Hard
Car (Detection)          93.99 %    86.17 %    76.19 %
Car (Orientation)        93.78 %    85.44 %    75.29 %
Car (3D Detection)       24.52 %    16.26 %    13.93 %
Car (Bird's Eye View)    32.20 %    21.45 %    18.68 %


2D object detection results.



Orientation estimation results.



3D object detection results.



Bird's eye view results.



