\begin{tabular}{c | c | c | c | c | c | c}
{\bf Method} & {\bf Setting} & {\bf Moderate} & {\bf Easy} & {\bf Hard} & {\bf Runtime / Environment} & {\bf Reference}\\ \hline
MMLab PV-RCNN & la & 94.57 \% & 98.15 \% & 91.85 \% & 0.08 s / 1 core & S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang and H. Li: PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection. CVPR 2020.\\
MVRA + I-FRCNN+ & & 94.46 \% & 95.66 \% & 81.74 \% & 0.18 s / GPU & H. Choi, H. Kang and Y. Hyun: Multi-View Reprojection Architecture for Orientation Estimation. The IEEE International Conference on Computer Vision (ICCV) Workshops 2019.\\
CPRCCNN & & 94.24 \% & 96.31 \% & 89.71 \% & 0.1 s / 1 core & \\
EPNet & & 94.22 \% & 96.13 \% & 89.68 \% & 0.1 s / 1 core & \\
D3D & & 94.18 \% & 95.22 \% & 89.14 \% & 0.02 s / 1 core & \\
Patches - EMP & la & 93.58 \% & 97.88 \% & 90.31 \% & 0.5 s / GPU & J. Lehner, A. Mitterecker, T. Adler, M. Hofmarcher, B. Nessler and S. Hochreiter: Patch Refinement: Localized 3D Object Detection. arXiv preprint arXiv:1910.04093 2019.\\
OAP & & 93.35 \% & 96.56 \% & 85.69 \% & 0.06 s / 1 core & \\
CLOCs\_PointCas & & 93.34 \% & 96.66 \% & 85.87 \% & 0.1 s / GPU & \\
Deep MANTA & & 93.31 \% & 98.83 \% & 82.95 \% & 0.7 s / GPU & F. Chabot, M. Chaouch, J. Rabarisoa, C. Teulière and T. Chateau: Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image. CVPR 2017.\\
Associate-3Ddet\_v2 & & 93.24 \% & 96.58 \% & 88.04 \% & 0.04 s / 1 core & \\
AIMC-RUC & & 93.14 \% & 96.64 \% & 87.92 \% & 0.08 s / 1 core & \\
ELE & & 93.07 \% & 98.42 \% & 90.17 \% & 0.1 s / GPU & \\
RGB3D & la & 92.94 \% & 96.52 \% & 87.83 \% & 0.39 s / GPU & \\
MVX-Net++ & & 92.93 \% & 96.16 \% & 87.69 \% & 0.15 s / 1 core & \\
PC-RGNN & & 92.91 \% & 96.54 \% & 87.67 \% & 0.1 s / GPU & \\
AAMF-SSD & & 92.88 \% & 96.44 \% & 87.67 \% & 0.05 s / GPU & \\
FLID & & 92.77 \% & 95.64 \% & 85.00 \% & 0.04 s / GPU & \\
OHS & & 92.74 \% & 96.20 \% & 89.68 \% & 0.04 s / 1 core & \\
PointRCNN-deprecated & la & 92.74 \% & 96.70 \% & 85.51 \% & 0.1 s / GPU & \\
RGB-SSD & & 92.73 \% & 96.38 \% & 87.46 \% & 0.1 s / 1 core & \\
IGRP & & 92.66 \% & 96.27 \% & 87.63 \% & 0.18 s / 1 core & \\
& & 92.58 \% & 96.08 \% & 89.60 \% & / & \\
SARPNET & & 92.58 \% & 95.82 \% & 87.33 \% & 0.05 s / 1 core & Y. Ye, H. Chen, C. Zhang, X. Hao and Z. Zhang: SARPNET: Shape Attention Regional Proposal Network for LiDAR-based 3D Object Detection. Neurocomputing 2019.\\
Patches & la & 92.57 \% & 96.31 \% & 87.41 \% & 0.15 s / GPU & J. Lehner, A. Mitterecker, T. Adler, M. Hofmarcher, B. Nessler and S. Hochreiter: Patch Refinement: Localized 3D Object Detection. arXiv preprint arXiv:1910.04093 2019.\\
R-GCN & & 92.53 \% & 96.16 \% & 87.45 \% & 0.16 s / GPU & J. Zarzar, S. Giancola and B. Ghanem: PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement. ArXiv 2019.\\
PPFNet & & 92.52 \% & 96.30 \% & 87.44 \% & 0.1 s / 1 core & \\
PI-RCNN & & 92.52 \% & 96.15 \% & 87.47 \% & 0.1 s / 1 core & L. Xie, C. Xiang, Z. Yu, G. Xu, Z. Yang, D. Cai and X. He: PI-RCNN: An Efficient Multi-sensor 3D Object Detector with Point-based Attentive Cont-conv Fusion Module. AAAI 2020 : The Thirty-Fourth AAAI Conference on Artificial Intelligence 2020.\\
Discrete-PointDet & & 92.48 \% & 95.89 \% & 87.08 \% & 0.02 s / 1 core & \\
PointPainting & la & 92.43 \% & 98.36 \% & 89.49 \% & 0.4 s / GPU & S. Vora, A. Lang, B. Helou and O. Beijbom: PointPainting: Sequential Fusion for 3D Object Detection. CVPR 2020.\\
3D IoU-Net & & 92.42 \% & 96.31 \% & 87.60 \% & 0.1 s / 1 core & J. Li, S. Luo, Z. Zhu, H. Dai, S. Krylov, Y. Ding and L. Shao: 3D IoU-Net: IoU Guided 3D Object Detector for Point Clouds. arXiv preprint arXiv:2004.04962 2020.\\
CLOCs\_SecCas & & 92.37 \% & 95.16 \% & 88.43 \% & 0.1 s / 1 core & \\
& & 92.32 \% & 95.83 \% & 89.39 \% & / & \\
OneCoLab SicNet & & 92.17 \% & 95.53 \% & 89.51 \% & 0.08 s / 1 core & \\
SegVoxelNet & & 92.16 \% & 95.86 \% & 86.90 \% & 0.04 s / 1 core & H. Yi, S. Shi, M. Ding, J. Sun, K. Xu, H. Zhou, Z. Wang, S. Li and G. Wang: SegVoxelNet: Exploring Semantic Context and Depth-aware Features for 3D Vehicle Detection from Point Cloud. ICRA 2020.\\
CP & la & 92.16 \% & 96.05 \% & 87.22 \% & 0.1 s / 1 core & \\
PointRGCN & & 92.15 \% & 97.48 \% & 86.83 \% & 0.26 s / GPU & J. Zarzar, S. Giancola and B. Ghanem: PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement. ArXiv 2019.\\
LZY\_RCNN & & 92.06 \% & 93.39 \% & 89.45 \% & 0.08 s / 1 core & \\
RethinkDet3D & & 92.04 \% & 95.68 \% & 86.97 \% & 0.15 s / 1 core & \\
F-ConvNet & la & 91.98 \% & 95.81 \% & 79.83 \% & 0.47 s / GPU & Z. Wang and K. Jia: Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection. IROS 2019.\\
PointCSE & & 91.95 \% & 95.52 \% & 86.75 \% & 0.02 s / 1 core & \\
IE-PointRCNN & & 91.94 \% & 96.00 \% & 86.84 \% & 0.1 s / 1 core & \\
AB3DMOT & la on & 91.87 \% & 95.86 \% & 86.78 \% & 0.0047 s / 1 core & X. Weng and K. Kitani: A Baseline for 3D Multi-Object Tracking. arXiv:1907.03961 2019.\\
MMLab-PointRCNN & la & 91.77 \% & 95.90 \% & 86.92 \% & 0.1 s / GPU & S. Shi, X. Wang and H. Li: Pointrcnn: 3d object proposal generation and detection from point cloud. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019.\\
MMLab-PartA$^2$ & la & 91.73 \% & 95.00 \% & 88.86 \% & 0.08 s / GPU & S. Shi, Z. Wang, J. Shi, X. Wang and H. Li: From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020.\\
C-GCN & & 91.57 \% & 95.63 \% & 86.13 \% & 0.147 s / GPU & J. Zarzar, S. Giancola and B. Ghanem: PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement. ArXiv 2019.\\
CU-PointRCNN & & 91.25 \% & 97.24 \% & 86.85 \% & 0.1 s / GPU & \\
RUC & & 91.25 \% & 95.01 \% & 88.14 \% & 0.12 s / 1 core & \\
deprecated & & 91.18 \% & 96.19 \% & 83.25 \% & 0.05 s / 1 core & \\
3DBN\_2 & & 91.05 \% & 94.89 \% & 88.42 \% & 0.12 s / 1 core & \\
deprecated & & 91.02 \% & 94.06 \% & 78.56 \% & 0.05 s / GPU & \\
Mono3CN & & 90.96 \% & 94.22 \% & 82.86 \% & 0.1 s / 1 core & \\
HRI-VoxelFPN & & 90.76 \% & 96.35 \% & 85.37 \% & 0.02 s / GPU & H. Kuang, B. Wang, J. An, M. Zhang and Z. Zhang: Voxel-FPN: multi-scale voxel feature aggregation in 3D object detection from point clouds. Sensors 2020.\\
SSL-RTM3D & & 90.70 \% & 96.34 \% & 80.72 \% & 0.03 s / 1 core & \\
anonymous & & 90.70 \% & 96.46 \% & 82.39 \% & 1 s / 1 core & \\
PointPillars & la & 90.70 \% & 93.84 \% & 87.47 \% & 16 ms / & A. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang and O. Beijbom: PointPillars: Fast Encoders for Object Detection from Point Clouds. CVPR 2019.\\
WS3D & la & 90.69 \% & 94.85 \% & 85.94 \% & 0.1 s / GPU & \\
CentrNet-v1 & la & 90.48 \% & 93.79 \% & 87.43 \% & 0.03 s / GPU & \\
DDB & la & 90.38 \% & 93.21 \% & 86.42 \% & 0.05 s / GPU & \\
OACV & & 90.35 \% & 93.95 \% & 81.90 \% & 0.23 s / GPU & \\
autonet & & 90.31 \% & 93.30 \% & 87.00 \% & 0.12 s / 1 core & \\
MVSLN & & 90.26 \% & 95.95 \% & 82.75 \% & 0.1 s / 1 core & \\
3D IoU Loss & la & 90.21 \% & 95.60 \% & 84.96 \% & 0.08 s / GPU & D. Zhou, J. Fang, X. Song, C. Guan, J. Yin, Y. Dai and R. Yang: IoU Loss for 2D/3D Object Detection. International Conference on 3D Vision (3DV) 2019.\\
Bit & & 90.19 \% & 93.42 \% & 86.48 \% & 0.11 s / 1 core & \\
ARPNET & & 90.11 \% & 93.42 \% & 82.56 \% & 0.08 s / GPU & Y. Ye, C. Zhang and X. Hao: ARPNET: attention region proposal network for 3D object detection. Science China Information Sciences 2019.\\
TANet & & 90.11 \% & 93.52 \% & 84.61 \% & 0.035 s / GPU & Z. Liu, X. Zhao, T. Huang, R. Hu, Y. Zhou and X. Bai: TANet: Robust 3D Object Detection from Point Clouds with Triple Attention. AAAI 2020.\\
EPENet & & 90.09 \% & 93.83 \% & 86.76 \% & 0.04 s / 1 core & \\
FOFNet & la & 90.05 \% & 93.87 \% & 84.52 \% & 0.04 s / GPU & \\
SFB-SECOND & & 90.04 \% & 95.99 \% & 84.70 \% & 0.1 s / 1 core & \\
CentrNet-FG & & 90.04 \% & 93.51 \% & 87.02 \% & 0.03 s / 1 core & \\
PTS & la & 90.03 \% & 95.41 \% & 84.73 \% & 0.01 s / 1 core & \\
CG-Stereo & st & 89.98 \% & 96.28 \% & 82.21 \% & 0.57 s / & \\
Sogo\_MM & & 89.97 \% & 94.15 \% & 79.94 \% & 1.5 s / GPU & \\
Deep3DBox & & 89.88 \% & 94.62 \% & 76.40 \% & 1.5 s / GPU & A. Mousavian, D. Anguelov, J. Flynn and J. Kosecka: 3D Bounding Box Estimation Using Deep Learning and Geometry. CVPR 2017.\\
SECOND-V1.5 & la & 89.88 \% & 95.53 \% & 84.46 \% & 0.04 s / GPU & \\
PointPiallars\_SECA & & 89.86 \% & 92.96 \% & 86.46 \% & 0.06 s / 1 core & \\
Tencent\_ADlab\_Lidar & la & 89.82 \% & 93.37 \% & 85.67 \% & 0.1 s / GPU & \\
VOXEL\_FPN\_HR & & 89.81 \% & 93.52 \% & 84.59 \% & 0.12 s / 8 cores & \\
BVVF & & 89.77 \% & 95.55 \% & 84.48 \% & 0.1 s / 1 core & \\
baseline & & 89.69 \% & 92.61 \% & 86.03 \% & 0.12 s / 1 core & \\
GPP & & 89.68 \% & 93.94 \% & 80.60 \% & 0.23 s / GPU & A. Rangesh and M. Trivedi: Ground plane polling for 6dof pose estimation of objects on the road. arXiv preprint arXiv:1811.06666 2018.\\
SubCNN & & 89.53 \% & 94.11 \% & 79.14 \% & 2 s / GPU & Y. Xiang, W. Choi, Y. Lin and S. Savarese: Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection. IEEE Winter Conference on Applications of Computer Vision (WACV) 2017.\\
FCY & la & 89.49 \% & 93.02 \% & 85.72 \% & 0.02 s / GPU & \\
SAANet & & 89.46 \% & 95.64 \% & 82.12 \% & 0.10 s / 1 core & \\
SCNet & la & 89.36 \% & 95.23 \% & 84.03 \% & 0.04 s / GPU & Z. Wang, H. Fu, L. Wang, L. Xiao and B. Dai: SCNet: Subdivision Coding Network for Object Detection Based on 3D Point Cloud. IEEE Access 2019.\\
RUC & & 89.26 \% & 92.28 \% & 85.38 \% & 0.12 s / 1 core & \\
AVOD & la & 89.22 \% & 94.98 \% & 82.14 \% & 0.08 s / & J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object Detection from View Aggregation. IROS 2018.\\
RUC & & 88.90 \% & 92.68 \% & 84.04 \% & 0.12 s / 1 core & \\
PAD & & 88.71 \% & 93.09 \% & 84.86 \% & 0.15 s / 1 core & \\
AVOD-FPN & la & 88.61 \% & 94.65 \% & 83.71 \% & 0.1 s / & J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object Detection from View Aggregation. IROS 2018.\\
SS3D\_HW & & 88.50 \% & 94.45 \% & 68.61 \% & 0.4 s / GPU & \\
PSMD & & 88.29 \% & 93.59 \% & 75.35 \% & 0.1 s / GPU & \\
Prune & & 88.10 \% & 93.86 \% & 80.41 \% & 0.11 s / 1 core & \\
autoRUC & & 88.03 \% & 93.80 \% & 80.36 \% & 0.12 s / 1 core & \\
PointRes & la gp on & 87.83 \% & 95.24 \% & 83.39 \% & 0.013 s / 1 core & \\
DeepStereoOP & & 87.81 \% & 93.68 \% & 77.60 \% & 3.4 s / GPU & C. Pham and J. Jeon: Robust Object Proposals Re-ranking for Object Detection in Autonomous Driving Using Convolutional Neural Networks. Signal Processing: Image Communication 2017.\\
3DBN & la & 87.59 \% & 93.34 \% & 79.91 \% & 0.13 s / & X. Li, J. Guivant, N. Kwok and Y. Xu: 3D Backbone Network for 3D Object Detection. CoRR 2019.\\
FQNet & & 87.49 \% & 93.66 \% & 73.61 \% & 0.5 s / 1 core & L. Liu, J. Lu, C. Xu, Q. Tian and J. Zhou: Deep Fitting Degree Scoring Network for Monocular 3D Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019.\\
Shift R-CNN (mono) & & 87.47 \% & 93.75 \% & 77.19 \% & 0.25 s / GPU & A. Naiden, V. Paunescu, G. Kim, B. Jeon and M. Leordeanu: Shift R-CNN: Deep Monocular 3D Object Detection With Closed-form Geometric Constraints. ICIP 2019.\\
PP-3D & & 87.46 \% & 93.09 \% & 79.88 \% & 0.1 s / 1 core & \\
MonoPSR & & 87.45 \% & 93.29 \% & 72.26 \% & 0.2 s / GPU & J. Ku*, A. Pon* and S. Waslander: Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction. CVPR 2019.\\
Mono3D & & 87.28 \% & 93.13 \% & 77.00 \% & 4.2 s / GPU & X. Chen, K. Kundu, Z. Zhang, H. Ma, S. Fidler and R. Urtasun: Monocular 3D Object Detection for Autonomous Driving. CVPR 2016.\\
Stereo3D & st & 87.26 \% & 93.70 \% & 67.54 \% & 0.08 s / 1 core & \\
3DNN & & 87.08 \% & 93.78 \% & 79.72 \% & 0.09 s / GPU & \\
SMOKE & & 87.02 \% & 92.94 \% & 77.12 \% & 0.03 s / GPU & Z. Liu, Z. Wu and R. Tóth: SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation. 2020.\\
MonoSS & & 86.95 \% & 92.88 \% & 77.04 \% & 0.03 s / GPU & \\
3DOP & st & 86.93 \% & 91.31 \% & 76.72 \% & 3 s / GPU & X. Chen, K. Kundu, Y. Zhu, A. Berneshawi, H. Ma, S. Fidler and R. Urtasun: 3D Object Proposals for Accurate Object Class Detection. NIPS 2015.\\
RTM3D & & 86.73 \% & 91.75 \% & 77.18 \% & 0.05 s / GPU & P. Li, H. Zhao, P. Liu and F. Cao: RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving. 2020.\\
voxelrcnn & & 86.61 \% & 94.59 \% & 79.80 \% & 15 s / 1 core & \\
MBR-SSD & & 86.57 \% & 90.97 \% & 78.03 \% & 4.0 s / GPU & \\
MonoPair & & 86.11 \% & 91.65 \% & 76.45 \% & 0.06 s / GPU & Y. Chen, L. Tai, K. Sun and M. Li: MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020.\\
DSGN & st & 86.03 \% & 95.42 \% & 78.27 \% & 0.67 s / & Y. Chen, S. Liu, X. Shen and J. Jia: DSGN: Deep Stereo Geometry Network for 3D Object Detection. CVPR 2020.\\
NL\_M3D & & 85.32 \% & 90.88 \% & 70.87 \% & 0.2 s / 1 core & \\
StereoFENet & st & 85.14 \% & 91.28 \% & 76.80 \% & 0.15 s / 1 core & W. Bao, B. Xu and Z. Chen: MonoFENet: Monocular 3D Object Detection with Feature Enhancement Networks. IEEE Transactions on Image Processing 2019.\\
PL++ (SDN+GDC) & st la & 84.42 \% & 94.83 \% & 76.95 \% & 0.6 s / GPU & Y. You, Y. Wang, W. Chao, D. Garg, G. Pleiss, B. Hariharan, M. Campbell and K. Weinberger: Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving. International Conference on Learning Representations 2020.\\
SS3D & & 84.38 \% & 92.57 \% & 69.82 \% & 48 ms / & E. Jörgensen, C. Zach and F. Kahl: Monocular 3D Object Detection and Box Fitting Trained End-to-End Using Intersection-over-Union Loss. CoRR 2019.\\
IDA-3D & st & 84.32 \% & 92.63 \% & 73.98 \% & 0.08 s / 1 core & \\
MonoFENet & & 84.09 \% & 91.42 \% & 75.93 \% & 0.15 s / 1 core & W. Bao, B. Xu and Z. Chen: MonoFENet: Monocular 3D Object Detection with Feature Enhancement Networks. IEEE Transactions on Image Processing 2019.\\
SECA & & 83.99 \% & 92.34 \% & 78.85 \% & 1 s / GPU & \\
Complexer-YOLO & la & 83.89 \% & 91.77 \% & 79.24 \% & 0.06 s / GPU & M. Simon, K. Amende, A. Kraus, J. Honer, T. Samann, H. Kaulbersch, S. Milz and H. Michael Gross: Complexer-YOLO: Real-Time 3D Object Detection and Tracking on Semantic Point Clouds. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2019.\\
ZoomNet & st & 83.79 \% & 94.14 \% & 68.78 \% & 0.3 s / 1 core & L. Z. Xu: ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence 2020.\\
seivl & & 83.38 \% & 90.32 \% & 81.41 \% & 0.1 s / 1 core & \\
M3D-RPN & & 82.81 \% & 88.38 \% & 67.08 \% & 0.16 s / GPU & G. Brazil and X. Liu: M3D-RPN: Monocular 3D Region Proposal Network for Object Detection. ICCV 2019.\\
RAR-Net & & 82.63 \% & 88.40 \% & 66.90 \% & 0.5 s / 1 core & \\
SSL-RTM3D Res18 & & 82.43 \% & 93.13 \% & 72.47 \% & 0.02 s / GPU & \\
ASOD & & 82.13 \% & 93.56 \% & 67.32 \% & 0.28 s / GPU & \\
D4LCN & & 82.08 \% & 90.01 \% & 63.98 \% & 0.2 s / GPU & M. Ding, Y. Huo, H. Yi, Z. Wang, J. Shi, Z. Lu and P. Luo: Learning Depth-Guided Convolutions for Monocular 3D Object Detection. CVPR 2020.\\
deprecated & & 81.99 \% & 92.07 \% & 67.48 \% & / 1 core & \\
S3D & & 81.93 \% & 91.59 \% & 67.43 \% & 0.1 s / 1 core & \\
Pseudo-LiDAR++ & st & 81.87 \% & 94.14 \% & 74.29 \% & 0.4 s / GPU & Y. You, Y. Wang, W. Chao, D. Garg, G. Pleiss, B. Hariharan, M. Campbell and K. Weinberger: Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving. International Conference on Learning Representations 2020.\\
LNET & & 81.81 \% & 91.36 \% & 67.33 \% & 0.05 s / 1 core & \\
PG-MonoNet & & 81.77 \% & 87.61 \% & 66.06 \% & 0.19 s / GPU & \\
Disp R-CNN & st & 81.61 \% & 92.91 \% & 69.20 \% & 0.42 s / GPU & J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao: Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation. CVPR 2020.\\
Pseudo-LiDAR E2E & st & 81.56 \% & 93.74 \% & 74.23 \% & 0.4 s / GPU & \\
HG-Mono & & 81.53 \% & 88.76 \% & 63.12 \% & 0.46 s / GPU & \\
HR-SECOND & & 81.23 \% & 88.32 \% & 74.89 \% & 0.11 s / 1 core & \\
BS3D & & 81.22 \% & 94.66 \% & 68.39 \% & 22 ms / & N. Gählert, J. Wan, M. Weber, J. Zöllner, U. Franke and J. Denzler: Beyond Bounding Boxes: Using Bounding Shapes for Real-Time 3D Vehicle Detection from Monocular RGB Images. 2019 IEEE Intelligent Vehicles Symposium (IV) 2019.\\
DP3D & & 81.07 \% & 87.49 \% & 65.12 \% & 0.05 s / GPU & \\
DP3D & & 80.87 \% & 87.58 \% & 64.88 \% & 0.07 s / GPU & \\
FRCNN+Or & & 80.57 \% & 91.50 \% & 67.49 \% & 0.09 s / & C. Guindel, D. Martin and J. Armingol: Fast Joint Object Detection and Viewpoint Estimation for Traffic Scene Understanding. IEEE Intelligent Transportation Systems Magazine 2018. C. Guindel, D. Martin and J. Armingol: Joint Object Detection and Viewpoint Estimation using CNN features. IEEE International Conference on Vehicular Electronics and Safety (ICVES) 2017.\\
UM3D\_TUM & & 80.15 \% & 92.80 \% & 65.77 \% & 0.05 s / 1 core & \\
YoloMono3D & & 78.50 \% & 91.43 \% & 58.80 \% & 0.05 s / GPU & \\
3D-GCK & & 78.44 \% & 88.59 \% & 66.28 \% & 24 ms / & \\
3D-SSMFCNN & & 77.82 \% & 77.84 \% & 68.67 \% & 0.1 s / GPU & L. Novak: Vehicle Detection and Pose Estimation for Autonomous Driving. 2017.\\
DA-3Ddet & & 77.73 \% & 89.01 \% & 61.48 \% & 0.4 s / GPU & \\
3DVP & & 75.71 \% & 84.44 \% & 64.41 \% & 40 s / 8 cores & Y. Xiang, W. Choi, Y. Lin and S. Savarese: Data-Driven 3D Voxel Patterns for Object Category Recognition. IEEE Conference on Computer Vision and Pattern Recognition 2015.\\
GS3D & & 75.63 \% & 85.79 \% & 61.85 \% & 2 s / 1 core & B. Li, W. Ouyang, L. Sheng, X. Zeng and X. Wang: GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.\\
Pose-RCNN & & 75.41 \% & 89.49 \% & 63.57 \% & 2 s / $>$8 cores & M. Braun, Q. Rao, Y. Wang and F. Flohr: Pose-RCNN: Joint object detection and pose estimation using 3D object proposals. Intelligent Transportation Systems (ITSC), 2016 IEEE 19th International Conference on 2016.\\
avodC & & 75.35 \% & 86.76 \% & 70.17 \% & 0.1 s / GPU & \\
SubCat & & 75.26 \% & 83.31 \% & 59.55 \% & 0.7 s / 6 cores & E. Ohn-Bar and M. Trivedi: Learning to Detect Vehicles by Clustering Appearance Patterns. T-ITS 2015.\\
3D FCN & la & 74.54 \% & 86.65 \% & 67.73 \% & $>$5 s / 1 core & B. Li: 3D Fully Convolutional Network for Vehicle Detection in Point Cloud. IROS 2017.\\
OC Stereo & st & 73.34 \% & 86.86 \% & 61.37 \% & 0.35 s / 1 core & A. Pon, J. Ku, C. Li and S. Waslander: Object-Centric Stereo Matching for 3D Object Detection. ICRA 2020.\\
BdCost+DA+BB+MS & & 72.87 \% & 84.39 \% & 57.07 \% & TBD s / 4 cores & \\
BdCost+DA+MS & & 72.65 \% & 84.06 \% & 58.08 \% & TBD s / 4 cores & \\
BdCost+DA+BB & & 70.07 \% & 84.66 \% & 55.50 \% & TBD s / 4 cores & \\
ROI-10D & & 68.14 \% & 75.32 \% & 58.98 \% & 0.2 s / GPU & F. Manhardt, W. Kehl and A. Gaidon: ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape. Computer Vision and Pattern Recognition (CVPR) 2019.\\
BirdNet+ & la & 67.65 \% & 91.82 \% & 65.11 \% & 0.1 s / & A. Barrera, C. Guindel, J. Beltrán and F. García: BirdNet+: End-to-End 3D Object Detection in LiDAR Bird's Eye View. arXiv:2003.04188 [cs.CV] 2020.\\
multi-task CNN & & 67.51 \% & 79.00 \% & 58.80 \% & 25.1 ms / GPU & M. Oeljeklaus, F. Hoffmann and T. Bertram: A Fast Multi-Task CNN for Spatial Understanding of Traffic Scenes. IEEE Intelligent Transportation Systems Conference 2018.\\
Decoupled-3D v2 & & 67.47 \% & 88.23 \% & 54.04 \% & 0.08 s / GPU & \\
Decoupled-3D & & 67.23 \% & 87.34 \% & 53.84 \% & 0.08 s / GPU & Y. Cai, B. Li, Z. Jiao, H. Li, X. Zeng and X. Wang: Monocular 3D Object Detection with Decoupled Structured Polygon Estimation and Height-Guided Depth Estimation. AAAI 2020.\\
BdCost48LDCF & & 65.50 \% & 80.44 \% & 51.24 \% & 0.5 s / 8 cores & A. Fernández-Baldera, J. Buenaposada and L. Baumela: BAdaCost: Multi-class Boosting with Costs. Pattern Recognition 2018.\\
OC-DPM & & 65.32 \% & 77.35 \% & 51.00 \% & 10 s / 8 cores & B. Pepik, M. Stark, P. Gehler and B. Schiele: Occlusion Patterns for Object Class Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2013.\\
deprecated & & 65.30 \% & 69.02 \% & 63.66 \% & 0.05 s / GPU & \\
3DVSSD & & 65.28 \% & 79.56 \% & 55.73 \% & 0.06 s / 1 core & \\
RefinedMPL & & 64.02 \% & 87.95 \% & 52.06 \% & 0.15 s / GPU & J. Vianney, S. Aich and B. Liu: RefinedMPL: Refined Monocular PseudoLiDAR for 3D Object Detection in Autonomous Driving. arXiv preprint arXiv:1911.09712 2019.\\
BdCost48-25C & & 63.90 \% & 80.69 \% & 51.54 \% & 4 s / 1 core & \\
DPM-VOC+VP & & 63.58 \% & 79.09 \% & 46.59 \% & 8 s / 1 core & B. Pepik, M. Stark, P. Gehler and B. Schiele: Multi-view and 3D Deformable Part Models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2015.\\
AOG-View & & 62.62 \% & 77.62 \% & 48.27 \% & 3 s / 1 core & B. Li, T. Wu and S. Zhu: Integrating Context and Occlusion for Car Detection by Hierarchical And-Or Model. ECCV 2014.\\
monoref3d & & 58.30 \% & 77.65 \% & 46.90 \% & 0.1 s / 1 core & \\
ref3D & & 58.30 \% & 77.65 \% & 46.90 \% & 0.1 s / 1 core & \\
LSVM-MDPM-sv & & 57.48 \% & 70.23 \% & 42.54 \% & 10 s / 4 cores & P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010. A. Geiger, C. Wojek and R. Urtasun: Joint 3D Estimation of Objects and Scene Layout. NIPS 2011.\\
SAMME48LDCF & & 57.26 \% & 76.28 \% & 43.55 \% & 0.5 s / 8 cores & A. Fernández-Baldera, J. Buenaposada and L. Baumela: BAdaCost: Multi-class Boosting with Costs. Pattern Recognition 2018.\\
deprecated & & 57.01 \% & 62.54 \% & 54.94 \% & - / & \\
BirdNet & la & 56.94 \% & 79.20 \% & 54.88 \% & 0.11 s / & J. Beltrán, C. Guindel, F. Moreno, D. Cruzado, F. García and A. Escalera: BirdNet: A 3D Object Detection Framework from LiDAR Information. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.\\
ref3D & & 56.49 \% & 77.52 \% & 45.17 \% & 0.1 s / 1 core & \\
DEFT & & 51.66 \% & 57.41 \% & 50.02 \% & 1 s / GPU & \\
VeloFCN & la & 51.05 \% & 70.03 \% & 44.82 \% & 1 s / GPU & B. Li, T. Zhang and T. Xia: Vehicle Detection from 3D Lidar Using Fully Convolutional Network. RSS 2016.\\
Mono3D\_PLiDAR & & 49.39 \% & 76.90 \% & 41.13 \% & 0.1 s / & X. Weng and K. Kitani: Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud. arXiv:1903.09847 2019.\\
DPM-C8B1 & st & 48.00 \% & 57.76 \% & 35.52 \% & 15 s / 4 cores & J. Yebes, L. Bergasa and M. García-Garrido: Visual Object Recognition with 3D-Aware Features in KITTI Urban Scenes. Sensors 2015. J. Yebes, L. Bergasa, R. Arroyo and A. Lázaro: Supervised learning and evaluation of KITTI's cars detector with DPM. IV 2014.\\
LTN & & 46.54 \% & 48.96 \% & 41.58 \% & 0.4 s / GPU & T. Wang, X. He, Y. Cai and G. Xiao: Learning a Layout Transfer Network for Context Aware Object Detection. IEEE Transactions on Intelligent Transportation Systems 2019.\\
sensekitti & & 46.12 \% & 49.16 \% & 42.79 \% & 4.5 s / GPU & B. Yang, J. Yan, Z. Lei and S. Li: Craft Objects from Images. CVPR 2016.\\
ReSqueeze & & 45.58 \% & 49.08 \% & 41.33 \% & 0.03 s / GPU & \\
Resnet101Faster rcnn & & 44.01 \% & 51.21 \% & 39.19 \% & 1 s / 1 core & \\
anonymous & & 40.75 \% & 45.00 \% & 34.48 \% & 1 s / 1 core & \\
Chovy & & 40.34 \% & 41.64 \% & 38.31 \% & 0.04 s / GPU & \\
cvMax & & 40.31 \% & 41.97 \% & 37.57 \% & 0.04 s / GPU & \\
deprecated & & 40.03 \% & 40.31 \% & 37.35 \% & 0.04 s / GPU & \\
3D-CVF at SPA & la & 39.79 \% & 40.44 \% & 36.10 \% & 0.06 s / 1 core & J. Yoo, Y. Kim, J. Kim and J. Choi: 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection. arXiv preprint arXiv:2004.12636 2020.\\
deprecated & & 38.89 \% & 40.49 \% & 35.13 \% & 0.06 s / GPU & \\
FD2 & & 38.89 \% & 48.29 \% & 34.35 \% & 0.01 s / GPU & \\
bin & & 38.58 \% & 43.36 \% & 32.42 \% & 15 ms / GPU & \\
PVF-NET & & 38.53 \% & 39.57 \% & 38.23 \% & 0.1 s / 1 core & \\
DGIST-CellBox & & 38.36 \% & 39.11 \% & 36.15 \% & 0.1 s / GPU & \\
SA-SSD & & 38.30 \% & 39.40 \% & 37.07 \% & 0.04 s / 1 core & C. He, H. Zeng, J. Huang, X. Hua and L. Zhang: Structure Aware Single-stage 3D Object Detection from Point Cloud. CVPR 2020.\\
dgist\_multiDetNet & & 38.13 \% & 38.99 \% & 35.36 \% & 0.05 s / 1 core & \\
Faster RCNN + A & & 37.92 \% & 39.50 \% & 33.85 \% & 0.19 s / GPU & \\
KNN-GCNN & & 37.80 \% & 38.80 \% & 36.52 \% & 0.4 s / 1 core & \\
JSU-NET & & 37.60 \% & 41.33 \% & 33.41 \% & 0.1 s / 1 core & \\
Faster RCNN + G & & 37.49 \% & 39.05 \% & 33.40 \% & 1.1 s / GPU & \\
Faster RCNN + A & & 37.35 \% & 38.75 \% & 33.38 \% & 0.19 s / GPU & \\
yolo4 & & 37.27 \% & 38.19 \% & 32.45 \% & 0.02 s / 1 core & \\
Point-GNN & la & 37.20 \% & 38.66 \% & 36.29 \% & 0.6 s / GPU & W. Shi and R. Rajkumar: Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud. CVPR 2020.\\
F-3DNet & & 37.18 \% & 38.58 \% & 36.44 \% & 0.5 s / GPU & \\
GAFM & & 37.08 \% & 40.28 \% & 33.08 \% & 0.5 s / 1 core & \\
CRCNNA & & 37.04 \% & 40.19 \% & 32.03 \% & 0.1 s / 1 core & \\
Faster RCNN + Gr + A & & 36.95 \% & 38.22 \% & 33.16 \% & 1.29 s / GPU & \\
CSFADet & & 36.83 \% & 39.76 \% & 32.73 \% & 0.05 s / GPU & \\
cas\_retina & & 36.63 \% & 39.70 \% & 31.52 \% & 0.2 s / 4 cores & \\
GA\_BALANCE & & 36.62 \% & 38.44 \% & 31.94 \% & 1 s / 1 core & \\
GA\_rpn500 & & 36.54 \% & 38.33 \% & 32.67 \% & 1 s / 1 core & \\
GA2500 & & 36.54 \% & 38.33 \% & 32.67 \% & 0.2 s / 1 core & \\
cas+res+soft & & 36.53 \% & 38.82 \% & 32.26 \% & 0.2 s / 4 cores & \\
merge12-12 & & 36.47 \% & 38.83 \% & 32.20 \% & 0.2 s / 4 cores & \\
GA\_FULLDATA & & 36.43 \% & 38.90 \% & 31.61 \% & 1 s / 4 cores & \\
AtrousDet & & 36.36 \% & 38.86 \% & 31.79 \% & 0.05 s / & \\
bigger\_ga & & 36.21 \% & 38.41 \% & 31.58 \% & 1 s / 1 core & \\
cas\_retina\_1\_13 & & 35.89 \% & 39.02 \% & 31.33 \% & 0.03 s / 4 cores & \\
cascadercnn & & 35.61 \% & 36.22 \% & 30.16 \% & 0.36 s / 4 cores & \\
Cmerge & & 35.02 \% & 38.33 \% & 29.06 \% & 0.2 s / 4 cores & \\
ga50 & & 34.95 \% & 38.21 \% & 30.29 \% & 1 s / 1 core & \\
softretina & & 34.57 \% & 39.31 \% & 29.27 \% & 0.16 s / 4 cores & \\
Retinanet100 & & 34.37 \% & 39.15 \% & 28.43 \% & 0.2 s / 4 cores & \\
ZKNet & & 34.27 \% & 38.09 \% & 29.93 \% & 0.01 s / GPU & \\
bifpn\_fsrn & & 33.84 \% & 37.56 \% & 29.98 \% & 0.07 s / 1 core & \\
LPN & & 33.61 \% & 34.57 \% & 29.72 \% & 0.2 s / GPU & \\
cascade\_gw & & 33.53 \% & 34.76 \% & 29.71 \% & 0.2 s / 4 cores & \\
RADNet-Fusion & la & 33.31 \% & 31.96 \% & 32.72 \% & 0.1 s / 1 core & \\
RADNet-LIDAR & la & 33.08 \% & 31.30 \% & 32.31 \% & 0.1 s / 1 core & \\
SceneNet & & 32.78 \% & 37.79 \% & 28.30 \% & 0.03 s / GPU & \\
MTDP & & 32.68 \% & 36.06 \% & 27.12 \% & 0.15 s / GPU & \\
CBNet & & 32.63 \% & 36.51 \% & 29.26 \% & 1 s / 4 cores & \\
Fast-SSD & & 32.51 \% & 41.41 \% & 28.45 \% & 0.06 s / & \\
centernet & & 32.22 \% & 35.79 \% & 28.50 \% & 0.01 s / GPU & \\
RFCN\_RFB & & 32.06 \% & 35.39 \% & 27.94 \% & 0.2 s / 4 cores & \\
FailNet-Fusion & la & 31.68 \% & 30.84 \% & 30.56 \% & 0.1 s / 1 core & \\
MTNAS & & 31.15 \% & 35.43 \% & 27.02 \% & 0.02 s / 1 core & \\
yolo800 & & 31.13 \% & 32.49 \% & 26.76 \% & 0.13 s / 4 cores & \\
FailNet-LIDAR & la & 31.10 \% & 30.32 \% & 29.89 \% & 0.1 s / 1 core & \\
VoxelNet(Unofficial) & & 31.08 \% & 34.54 \% & 28.79 \% & 0.5 s / GPU & \\
SAIC-SA-3D & la & 31.02 \% & 41.38 \% & 29.60 \% & 0.05 s / GPU & \\
RFCN & & 30.93 \% & 34.24 \% & 25.27 \% & 0.2 s / 4 cores & \\
AOG & & 29.81 \% & 33.28 \% & 23.91 \% & 3 s / 4 cores & T. Wu, B. Li and S. Zhu: Learning And-Or Models to Represent Context and Occlusion for Car Detection and Viewpoint Estimation. TPAMI 2016. B. Li, T. Wu and S. Zhu: Integrating Context and Occlusion for Car Detection by Hierarchical And-Or Model. ECCV 2014.\\
m-prcnn & st & 29.62 \% & 34.80 \% & 22.79 \% & 0.43 s / 1 core & \\
Multi-task DG & & 29.49 \% & 36.06 \% & 26.06 \% & 0.06 s / GPU & \\
DAM & & 28.97 \% & 37.05 \% & 25.28 \% & 1 s / GPU & \\
fasterrcnn & & 28.42 \% & 30.28 \% & 24.95 \% & 0.2 s / 4 cores & \\
RFBnet & & 27.91 \% & 34.44 \% & 25.24 \% & 0.2 s / 4 cores & \\
E-VoxelNet & & 26.87 \% & 27.66 \% & 24.05 \% & 0.1 s / GPU & \\
SubCat48LDCF & & 26.68 \% & 34.33 \% & 19.44 \% & 0.5 s / 8 cores & A. Fernández-Baldera, J. Buenaposada and L. Baumela: BAdaCost: Multi-class Boosting with Costs. Pattern Recognition 2018.\\
Lidar\_ROI+Yolo(UJS) & & 25.33 \% & 30.36 \% & 22.20 \% & 0.1 s / 1 core & \\
RADNet-Mono & & 24.78 \% & 28.55 \% & 22.84 \% & 0.1 s / 1 core & \\
RT3D-GMP & st & 24.27 \% & 28.33 \% & 18.51 \% & 0.06 s / GPU & \\
100Frcnn & & 23.32 \% & 32.81 \% & 19.45 \% & 2 s / 4 cores & \\
RT3DStereo & st & 21.41 \% & 25.58 \% & 17.52 \% & 0.08 s / GPU & H. Königshof, N. Salscheider and C. Stiller: Realtime 3D Object Detection for Automated Driving Using Stereo Vision and Semantic Information. Proc. IEEE Intl. Conf. Intelligent Transportation Systems 2019.\\
CSoR & la & 20.82 \% & 30.65 \% & 17.14 \% & 3.5 s / 4 cores & L. Plotkin: PyDriver: Entwicklung eines Frameworks für räumliche Detektion und Klassifikation von Objekten in Fahrzeugumgebung. 2015.\\
FailNet-Mono & & 19.63 \% & 25.13 \% & 17.19 \% & 0.1 s / 1 core & \\
RT3D & la & 18.96 \% & 24.41 \% & 19.85 \% & 0.09 s / GPU & Y. Zeng, Y. Hu, S. Liu, J. Ye, Y. Han, X. Li and N. Sun: RT3D: Real-Time 3-D Vehicle Detection in LiDAR Point Cloud for Autonomous Driving. IEEE Robotics and Automation Letters 2018.\\
softyolo & & 18.31 \% & 26.80 \% & 15.28 \% & 0.16 s / 4 cores & \\
Licar & la & 16.16 \% & 18.56 \% & 15.59 \% & 0.09 s / GPU & \\
VoxelJones & & 15.41 \% & 17.83 \% & 14.13 \% & 0.18 s / 1 core & M. Motro and J. Ghosh: Vehicular Multi-object Tracking with Persistent Detector Failures. arXiv preprint arXiv:1907.11306 2019.\\
KD53-20 & & 13.76 \% & 20.58 \% & 11.91 \% & 0.19 s / 4 cores & \\
Scan\_YOLO & & 9.08 \% & 10.19 \% & 8.40 \% & 0.1 s / 4 cores & \\
MuRF & & 1.75 \% & 0.63 \% & 2.14 \% & 0.05 s / GPU & \\
MP & & 1.51 \% & 0.63 \% & 2.03 \% & 0.2 s / 1 core & \\
PiP & & 1.45 \% & 0.56 \% & 1.85 \% & 0.05 s / 1 core & \\
Simple3D Net & & 1.38 \% & 0.63 \% & 1.76 \% & 0.02 s / 1 core & \\
SPA & & 1.25 \% & 0.59 \% & 1.64 \% & 0.1 s / 1 core & \\
Associate-3Ddet & & 1.20 \% & 0.52 \% & 1.38 \% & 0.05 s / 1 core & L. Du*, X. Ye*, X. Tan, J. Feng, Z. Xu, E. Ding and S. Wen: Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection. CVPR 2020.\\
FCPP & & 0.06 \% & 0.00 \% & 0.07 \% & 0.02 s / 1 core & \\
JSyolo & & 0.00 \% & 0.00 \% & 0.00 \% & 0.16 s / 4 cores &
\end{tabular}