\begin{tabular}{c | c | c | c | c | c | p{0.4\linewidth}}
{\bf Method} & {\bf Setting} & {\bf Moderate} & {\bf Easy} & {\bf Hard} & {\bf Runtime / Environment} & {\bf Reference}\\ \hline
VirConv-S & & 93.52 \% & 95.99 \% & 90.38 \% & 0.09 s / 1 core & H. Wu, C. Wen, S. Shi and C. Wang: Virtual Sparse Convolution for Multimodal 3D Object Detection. CVPR 2023.\\
UDeerPEP & & 93.40 \% & 95.34 \% & 89.07 \% & 0.1 s / 1 core & Z. Dong, H. Ji, X. Huang, W. Zhang, X. Zhan and J. Chen: PeP: a Point enhanced Painting method for unified point cloud tasks. 2023.\\
VirConv-T & & 92.65 \% & 96.11 \% & 89.69 \% & 0.09 s / 1 core & H. Wu, C. Wen, S. Shi and C. Wang: Virtual Sparse Convolution for Multimodal 3D Object Detection. CVPR 2023.\\
GraR-Po & & 92.12 \% & 95.79 \% & 87.11 \% & 0.06 s / 1 core & H. Yang, Z. Liu, X. Wu, W. Wang, W. Qian, X. He and D. Cai: Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph. ECCV 2022.\\
TSSTDet & & 92.11 \% & 95.80 \% & 89.23 \% & 0.08 s / 1 core & H. Hoang, D. Bui and M. Yoo: TSSTDet: Transformation-Based 3-D Object Detection via a Spatial Shape Transformer. IEEE Sensors Journal 2024.\\
MPCF & & 92.07 \% & 95.92 \% & 87.29 \% & 0.08 s / 1 core & \\
TED & & 92.05 \% & 95.44 \% & 87.30 \% & 0.1 s / 1 core & H. Wu, C. Wen, W. Li, R. Yang and C. Wang: Transformation-Equivariant 3D Object Detection for Autonomous Driving. AAAI 2023.\\
MB3D & & 91.93 \% & 95.33 \% & 88.71 \% & 0.09 s / 1 core & \\
PVFusion & & 91.87 \% & 95.01 \% & 86.96 \% & 0.01 s / 1 core & \\
VPFNet & & 91.86 \% & 93.02 \% & 86.94 \% & 0.06 s / 2 cores & H. Zhu, J. Deng, Y. Zhang, J. Ji, Q. Mao, H. Li and Y. Zhang: VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion. IEEE Transactions on Multimedia 2022.\\
SFD & & 91.85 \% & 95.64 \% & 86.83 \% & 0.1 s / 1 core & X. Wu, L. Peng, H. Yang, L. Xie, C. Huang, C. Deng, H. Liu and D. Cai: Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion. CVPR 2022.\\
SE-SSD & la & 91.84 \% & 95.68 \% & 86.72 \% & 0.03 s / 1 core & W. Zheng, W. Tang, L. Jiang and C. Fu: SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud. CVPR 2021.\\
HDet3D & & 91.82 \% & 94.90 \% & 84.68 \% & 0.07 s / $>$8 cores & \\
LVP(84.92) & & 91.80 \% & 95.49 \% & 88.91 \% & 0.04 s / 1 core & \\
ACFNet & & 91.78 \% & 92.91 \% & 87.06 \% & 0.11 s / 1 core & Y. Tian, X. Zhang, X. Wang, J. Xu, J. Wang, R. Ai, W. Gu and W. Ding: ACF-Net: Asymmetric Cascade Fusion for 3D Detection With LiDAR Point Clouds and Images. IEEE Transactions on Intelligent Vehicles 2023.\\
GraR-Vo & & 91.72 \% & 95.27 \% & 86.51 \% & 0.04 s / 1 core & H. Yang, Z. Liu, X. Wu, W. Wang, W. Qian, X. He and D. Cai: Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph. ECCV 2022.\\
NIV-SSD & & 91.69 \% & 95.66 \% & 86.72 \% & 0.03 s / 1 core & \\
PVT-SSD & & 91.63 \% & 95.23 \% & 86.43 \% & 0.05 s / 1 core & H. Yang, W. Wang, M. Chen, B. Lin, T. He, H. Chen, X. He and W. Ouyang: PVT-SSD: Single-Stage 3D Object Detector with Point-Voxel Transformer. CVPR 2023.\\
HAF-PVP\_test & & 91.60 \% & 95.33 \% & 86.71 \% & 0.09 s / 1 core & \\
SPANet & & 91.59 \% & 95.59 \% & 86.53 \% & 0.06 s / 1 core & Y. Ye: SPANet: Spatial and Part-Aware Aggregation Network for 3D Object Detection. Pacific Rim International Conference on Artificial Intelligence 2021.\\
CasA & & 91.54 \% & 95.19 \% & 86.82 \% & 0.1 s / 1 core & H. Wu, J. Deng, C. Wen, X. Li and C. Wang: CasA: A Cascade Attention Network for 3D Object Detection from LiDAR point clouds. IEEE Transactions on Geoscience and Remote Sensing 2022.\\
FEIF3D & la & 91.53 \% & 95.29 \% & 86.87 \% & 0.1 s / GPU & \\
LoGoNet & & 91.52 \% & 95.48 \% & 87.09 \% & 0.1 s / 1 core & X. Li, T. Ma, Y. Hou, B. Shi, Y. Yang, Y. Liu, X. Wu, Q. Chen, Y. Li, Y. Qiao and others: LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion. CVPR 2023.\\
GraR-Pi & & 91.52 \% & 95.06 \% & 86.42 \% & 0.03 s / 1 core & H. Yang, Z. Liu, X. Wu, W. Wang, W. Qian, X. He and D. Cai: Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph. ECCV 2022.\\
MM-UniMODE & & 91.51 \% & 95.69 \% & 88.71 \% & 0.04 s / 1 core & \\
VIFF-L & & 91.50 \% & 95.44 \% & 88.68 \% & 0.04 s / 1 core & \\
Anonymous & & 91.49 \% & 95.27 \% & 88.93 \% & 0.04 s / 1 core & \\
SCEMF & & 91.46 \% & 94.76 \% & 88.77 \% & 1 s / 1 core & \\
MAK\_VOXEL\_RCNN & & 91.46 \% & 95.32 \% & 86.81 \% & 0.03 s / 1 core & \\
UPIDet & & 91.36 \% & 92.96 \% & 86.80 \% & 0.11 s / 1 core & Y. Zhang, Q. Zhang, J. Hou, Y. Yuan and G. Xing: Unleash the Potential of Image Branch for Cross-modal 3D Object Detection. Thirty-seventh Conference on Neural Information Processing Systems 2023.\\
BADet & & 91.32 \% & 95.23 \% & 86.48 \% & 0.14 s / 1 core & R. Qian, X. Lai and X. Li: BADet: Boundary-Aware 3D Object Detection from Point Clouds. Pattern Recognition 2022.\\
Anonymous & & 91.30 \% & 95.26 \% & 86.73 \% & 0.1 s / 1 core & \\
ANM & & 91.30 \% & 94.91 \% & 88.51 \% & ANM / & \\
DEF-Model & & 91.28 \% & 93.03 \% & 86.48 \% & 0.03 s / 1 core & \\
SSLFusion & & 91.26 \% & 94.86 \% & 88.55 \% & 0.5 s / 1 core & \\
TED-S Reproduced & & 91.23 \% & 95.34 \% & 86.68 \% & 0.1 s / 1 core & \\
URFormer & & 91.22 \% & 94.40 \% & 86.35 \% & 0.1 s / 1 core & \\
CasA++ & & 91.22 \% & 94.57 \% & 88.43 \% & 0.1 s / 1 core & H. Wu, J. Deng, C. Wen, X. Li and C. Wang: CasA: A Cascade Attention Network for 3D Object Detection from LiDAR point clouds. IEEE Transactions on Geoscience and Remote Sensing 2022.\\
OGMMDet & & 91.21 \% & 95.59 \% & 88.33 \% & 0.01 s / 1 core & \\
voxel\_spark & & 91.18 \% & 94.82 \% & 86.58 \% & 0.04 s / GPU & \\
spark & & 91.13 \% & 94.93 \% & 86.54 \% & 0.1 s / 1 core & \\
3D HANet & & 91.13 \% & 94.33 \% & 86.33 \% & 0.1 s / 1 core & Q. Xia, Y. Chen, G. Cai, G. Chen, D. Xie, J. Su and Z. Wang: 3D HANet: A Flexible 3D Heatmap Auxiliary Network for Object Detection. IEEE Transactions on Geoscience and Remote Sensing 2023.\\
test & & 91.12 \% & 93.93 \% & 86.17 \% & 0.1 s / 1 core & \\
DiffCandiDet & & 91.11 \% & 95.05 \% & 86.45 \% & 0.06 s / GPU & \\
spark\_voxel\_rcnn & & 91.08 \% & 94.61 \% & 86.59 \% & 0.04 s / 1 core & \\
voxel-rcnn+++ & & 91.06 \% & 92.84 \% & 86.27 \% & 0.08 s / GPU & \\
SA-SSD & & 91.03 \% & 95.03 \% & 85.96 \% & 0.04 s / 1 core & C. He, H. Zeng, J. Huang, X. Hua and L. Zhang: Structure Aware Single-stage 3D Object Detection from Point Cloud. CVPR 2020.\\
L-AUG & & 91.00 \% & 94.52 \% & 88.08 \% & 0.1 s / 1 core & T. Cortinhal, I. Gouigah and E. Aksoy: Semantics-aware LiDAR-Only Pseudo Point Cloud Generation for 3D Object Detection. 2023.\\
TED\_S\_baseline & & 90.98 \% & 94.56 \% & 86.41 \% & 0.09 s / 1 core & \\
spark2 & & 90.95 \% & 92.93 \% & 86.44 \% & 0.1 s / 1 core & \\
HS-fusion & & 90.95 \% & 93.77 \% & 87.79 \% & - s / 1 core & \\
Voxel\_Spark\_focal\_we & & 90.93 \% & 94.83 \% & 86.45 \% & 0.08 s / 1 core & \\
c2f & & 90.89 \% & 92.31 \% & 86.25 \% & 1 s / 1 core & \\
3D Dual-Fusion & & 90.86 \% & 93.08 \% & 86.44 \% & 0.1 s / 1 core & Y. Kim, K. Park, M. Kim, D. Kum and J. Choi: 3D Dual-Fusion: Dual-Domain Dual-Query Camera-LiDAR Fusion for 3D Object Detection. arXiv preprint arXiv:2211.13529 2022.\\
PR-SSD & & 90.78 \% & 94.23 \% & 86.14 \% & 0.02 s / GPU & \\
MLFusion-VS & & 90.78 \% & 95.10 \% & 88.41 \% & 0.06 s / 1 core & \\
focal & & 90.74 \% & 92.58 \% & 88.36 \% & 100 s / 1 core & \\
GEFPN & & 90.74 \% & 92.58 \% & 88.36 \% & 0.5 s / 1 core & \\
GeVo & & 90.74 \% & 92.58 \% & 88.36 \% & 0.05 s / 1 core & \\
GraphAlign(ICCV2023) & & 90.73 \% & 94.46 \% & 88.34 \% & 0.03 s / GPU & Z. Song, H. Wei, L. Bai, L. Yang and C. Jia: GraphAlign: Enhancing accurate feature alignment by graph matching for multi-modal 3D object detection. Proceedings of the IEEE/CVF International Conference on Computer Vision 2023.\\
GF-pointnet & & 90.67 \% & 93.88 \% & 86.09 \% & 0.02 s / 1 core & \\
SDGUFusion & & 90.65 \% & 95.10 \% & 86.45 \% & 0.5 s / 1 core & \\
MMLab PV-RCNN & la & 90.65 \% & 94.98 \% & 86.14 \% & 0.08 s / 1 core & S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang and H. Li: PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection. CVPR 2020.\\
SQD & & 90.63 \% & 95.44 \% & 88.04 \% & 0.06 s / 1 core & \\
AFFN-G & & 90.61 \% & 94.46 \% & 88.12 \% & 0.5 s / 1 core & \\
focalnet & & 90.61 \% & 94.46 \% & 88.12 \% & 0.05 s / 1 core & \\
BPG3D & & 90.57 \% & 93.00 \% & 86.21 \% & 0.05 s / 1 core & \\
focalnet & & 90.56 \% & 94.52 \% & 88.08 \% & 0.05 s / 1 core & \\
VPFNet & & 90.52 \% & 93.94 \% & 86.25 \% & 0.2 s / 1 core & C. Wang, H. Chen and L. Fu: VPFNet: Voxel-Pixel Fusion Network for Multi-class 3D Object Detection. 2021. C. Wang, H. Chen, Y. Chen, P. Hsiao and L. Fu: VoPiFNet: Voxel-Pixel Fusion Network for Multi-Class 3D Object Detection. IEEE Transactions on Intelligent Transportation Systems 2024.\\
ECA & & 90.50 \% & 93.87 \% & 85.94 \% & 0.08 s / GPU & \\
PDV & & 90.48 \% & 94.56 \% & 86.23 \% & 0.1 s / 1 core & J. Hu, T. Kuai and S. Waslander: Point Density-Aware Voxels for LiDAR 3D Object Detection. CVPR 2022.\\
LGNet-3classes & & 90.44 \% & 94.98 \% & 86.06 \% & 0.11 s / 1 core & \\
test & & 90.39 \% & 94.58 \% & 85.69 \% & 0.04 s / GPU & \\
Spark\_PartA2\_Soft\_fo & & 90.38 \% & 93.90 \% & 85.91 \% & 0.1 s / 1 core & \\
M3DeTR & & 90.37 \% & 94.41 \% & 85.98 \% & n/a s / GPU & T. Guan, J. Wang, S. Lan, R. Chandra, Z. Wu, L. Davis and D. Manocha: M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers. 2021.\\
VoTr-TSD & & 90.34 \% & 94.03 \% & 86.14 \% & 0.07 s / 1 core & J. Mao, Y. Xue, M. Niu, H. Bai, J. Feng, X. Liang, H. Xu and C. Xu: Voxel Transformer for 3D Object Detection. ICCV 2021.\\
AFFN & & 90.33 \% & 94.29 \% & 85.99 \% & 0.5 s / 1 core & \\
Spark\_partA22 & & 90.23 \% & 92.61 \% & 85.89 \% & 10 s / 1 core & \\
DSA-PV-RCNN & la & 90.13 \% & 92.42 \% & 85.93 \% & 0.08 s / 1 core & P. Bhattacharyya, C. Huang and K. Czarnecki: SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection. 2021.\\
LFT & & 90.12 \% & 95.83 \% & 85.06 \% & 0.1 s / 1 core & \\
XView & & 90.12 \% & 92.27 \% & 85.94 \% & 0.1 s / 1 core & L. Xie, G. Xu, D. Cai and X. He: X-view: Non-egocentric Multi-View 3D Object Detector. 2021.\\
SFA-GCL & & 90.12 \% & 95.75 \% & 84.97 \% & 0.04 s / 1 core & \\
SFA-GCL(80) & & 90.11 \% & 95.76 \% & 84.96 \% & 0.04 s / 1 core & \\
GraR-VoI & & 90.10 \% & 95.69 \% & 86.85 \% & 0.07 s / 1 core & H. Yang, Z. Liu, X. Wu, W. Wang, W. Qian, X. He and D. Cai: Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph. ECCV 2022.\\
HA-PillarNet & & 90.07 \% & 92.73 \% & 85.98 \% & 0.05 s / 1 core & \\
CAT-Det & & 90.07 \% & 92.59 \% & 85.82 \% & 0.3 s / GPU & Y. Zhang, J. Chen and D. Huang: CAT-Det: Contrastively Augmented Transformer for Multi-modal 3D Object Detection. CVPR 2022.\\
3ONet & & 90.07 \% & 95.87 \% & 85.09 \% & 0.1 s / 1 core & H. Hoang and M. Yoo: 3ONet: 3-D Detector for Occluded Object Under Obstructed Conditions. IEEE Sensors Journal 2023.\\
SFA-GCL(80, k=4) & & 90.04 \% & 95.67 \% & 84.91 \% & 0.04 s / 1 core & \\
spark-part2 & & 90.01 \% & 93.82 \% & 85.89 \% & 0.1 s / 1 core & \\
SP\_SECOND\_IOU & & 89.95 \% & 92.23 \% & 85.84 \% & 0.04 s / 1 core & \\
CG-SSD & & 89.93 \% & 94.26 \% & 85.76 \% & 0.01 s / 1 core & \\
Anonymous & & 89.91 \% & 93.38 \% & 84.91 \% & 0.04 s / 1 core & \\
OFFNet & & 89.88 \% & 91.62 \% & 85.57 \% & 0.1 s / GPU & \\
SVGA-Net & & 89.88 \% & 92.07 \% & 85.59 \% & 0.03 s / 1 core & Q. He, Z. Wang, H. Zeng, Y. Zeng and Y. Liu: SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds. AAAI 2022.\\
EBM3DOD & & 89.86 \% & 95.64 \% & 84.56 \% & 0.12 s / 1 core & F. Gustafsson, M. Danelljan and T. Schön: Accurate 3D Object Detection using Energy-Based Models. arXiv preprint arXiv:2012.04634 2020.\\
CIA-SSD & la & 89.84 \% & 93.74 \% & 82.39 \% & 0.03 s / 1 core & W. Zheng, W. Tang, S. Chen, L. Jiang and C. Fu: CIA-SSD: Confident IoU-Aware Single-Stage Object Detector From Point Cloud. AAAI 2021.\\
MLF-DET & & 89.82 \% & 93.38 \% & 84.78 \% & 0.09 s / 1 core & Z. Lin, Y. Shen, S. Zhou, S. Chen and N. Zheng: MLF-DET: Multi-Level Fusion for Cross-Modal 3D Object Detection. International Conference on Artificial Neural Networks 2023.\\
CLOCs\_PVCas & & 89.80 \% & 93.05 \% & 86.57 \% & 0.1 s / 1 core & S. Pang, D. Morris and H. Radha: CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020.\\
VoxelFSD & & 89.79 \% & 92.57 \% & 85.77 \% & 0.08 s / 1 core & \\
GLENet-VR & & 89.76 \% & 93.48 \% & 84.89 \% & 0.04 s / 1 core & Y. Zhang, Q. Zhang, Z. Zhu, J. Hou and Y. Yuan: GLENet: Boosting 3D object detectors with generative label uncertainty estimation. International Journal of Computer Vision 2023. Y. Zhang, J. Hou and Y. Yuan: A Comprehensive Study of the Robustness for LiDAR-based 3D Object Detectors against Adversarial Attacks. International Journal of Computer Vision 2023.\\
RDIoU & & 89.75 \% & 94.90 \% & 84.67 \% & 0.03 s / 1 core & H. Sheng, S. Cai, N. Zhao, B. Deng, J. Huang, X. Hua, M. Zhao and G. Lee: Rethinking IoU-based Optimization for Single-stage 3D Object Detection. ECCV 2022.\\
PV-RCNN-Plus & & 89.75 \% & 91.93 \% & 85.77 \% & 1 s / 1 core & \\
SFA-GCL(baseline) & & 89.74 \% & 95.55 \% & 84.63 \% & 0.04 s / 1 core & \\
SFA-GCL\_dataaug & & 89.73 \% & 93.44 \% & 84.60 \% & 0.04 s / 1 core & \\
SFA-GCL & & 89.71 \% & 93.53 \% & 84.58 \% & 0.04 s / 1 core & \\
DGEnhCL & & 89.66 \% & 95.21 \% & 84.53 \% & 0.04 s / 1 core & \\
EBM3DOD baseline & & 89.63 \% & 95.44 \% & 84.34 \% & 0.05 s / 1 core & F. Gustafsson, M. Danelljan and T. Schön: Accurate 3D Object Detection using Energy-Based Models. arXiv preprint arXiv:2012.04634 2020.\\
SCNet3D & & 89.61 \% & 93.36 \% & 84.78 \% & 0.08 s / 1 core & \\
VPA & & 89.61 \% & 95.46 \% & 86.81 \% & 0.01 s / 1 core & \\
MAK & & 89.59 \% & 93.21 \% & 86.84 \% & 0.03 s / GPU & \\
pointpillars\_spark & & 89.57 \% & 92.98 \% & 84.91 \% & 0.02 s / GPU & \\
3D-CVF at SPA & la & 89.56 \% & 93.52 \% & 82.45 \% & 0.06 s / 1 core & J. Yoo, Y. Kim, J. Kim and J. Choi: 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection. ECCV 2020.\\
OcTr & & 89.56 \% & 93.08 \% & 86.74 \% & 0.06 s / GPU & C. Zhou, Y. Zhang, J. Chen and D. Huang: OcTr: Octree-based Transformer for 3D Object Detection. CVPR 2023.\\
Struc info fusion II & & 89.54 \% & 95.26 \% & 82.31 \% & 0.05 s / GPU & P. An, J. Liang, J. Ma, K. Yu and B. Fang: Struc info fusion. Submitted to CVIU 2021.\\
spark\_second\_focal\_w & & 89.53 \% & 91.19 \% & 85.11 \% & 0.1 s / 1 core & \\
spark\_second & & 89.53 \% & 91.23 \% & 85.02 \% & . s / 1 core & \\
IIOU & & 89.52 \% & 92.90 \% & 84.56 \% & 0.1 s / GPU & \\
spark\_pointpillar & & 89.51 \% & 93.58 \% & 85.03 \% & 0.02 s / GPU & \\
SASA & la & 89.51 \% & 92.87 \% & 86.35 \% & 0.04 s / 1 core & C. Chen, Z. Chen, J. Zhang and D. Tao: SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection. arXiv preprint arXiv:2201.01976 2022.\\
Fast-CLOCs & & 89.49 \% & 93.03 \% & 86.40 \% & 0.1 s / GPU & S. Pang, D. Morris and H. Radha: Fast-CLOCs: Fast Camera-LiDAR Object Candidates Fusion for 3D Object Detection. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022.\\
IA-SSD (single) & & 89.48 \% & 93.14 \% & 84.42 \% & 0.013 s / 1 core & Y. Zhang, Q. Hu, G. Xu, Y. Ma, J. Wan and Y. Guo: Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds. CVPR 2022.\\
KPTr & & 89.48 \% & 92.74 \% & 84.50 \% & 0.07 s / 1 core & \\
CLOCs & & 89.48 \% & 92.91 \% & 86.42 \% & 0.1 s / 1 core & S. Pang, D. Morris and H. Radha: CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020.\\
PA3DNet & & 89.46 \% & 93.11 \% & 84.60 \% & 0.1 s / GPU & M. Wang, L. Zhao and Y. Yue: PA3DNet: 3-D Vehicle Detection with Pseudo Shape Segmentation and Adaptive Camera-LiDAR Fusion. IEEE Transactions on Industrial Informatics 2023.\\
PG-RCNN & & 89.46 \% & 93.39 \% & 86.54 \% & 0.06 s / GPU & I. Koo, I. Lee, S. Kim, H. Kim, W. Jeon and C. Kim: PG-RCNN: Semantic Surface Point Generation for 3D Object Detection. 2023.\\
DFAF3D & & 89.45 \% & 93.14 \% & 84.22 \% & 0.05 s / 1 core & Q. Tang, X. Bai, J. Guo, B. Pan and W. Jiang: DFAF3D: A dual-feature-aware anchor-free single-stage 3D detector for point clouds. Image and Vision Computing 2023.\\
DVF-V & & 89.42 \% & 93.12 \% & 86.50 \% & 0.1 s / 1 core & A. Mahmoud, J. Hu and S. Waslander: Dense Voxel Fusion for 3D Object Detection. WACV 2023.\\
Struc info fusion I & & 89.38 \% & 94.91 \% & 84.29 \% & 0.05 s / 1 core & P. An, J. Liang, J. Ma, K. Yu and B. Fang: Struc info fusion. Submitted to CVIU 2021.\\
BtcDet & la & 89.34 \% & 92.81 \% & 84.55 \% & 0.09 s / GPU & Q. Xu, Y. Zhong and U. Neumann: Behind the Curtain: Learning Occluded Shapes for 3D Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence 2022.\\
IA-SSD (multi) & & 89.33 \% & 92.79 \% & 84.35 \% & 0.014 s / 1 core & Y. Zhang, Q. Hu, G. Xu, Y. Ma, J. Wan and Y. Guo: Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds. CVPR 2022.\\
GSG-FPS & & 89.32 \% & 92.77 \% & 84.27 \% & 0.01 s / 1 core & \\
spark\_second2 & & 89.27 \% & 90.94 \% & 84.85 \% & 10 s / 1 core & \\
ACDet & & 89.21 \% & 92.87 \% & 85.80 \% & 0.05 s / 1 core & J. Xu, G. Wang, X. Zhang and G. Wan: ACDet: Attentive Cross-view Fusion for LiDAR-based 3D Object Detection. 3DV 2022.\\
DVF-PV & & 89.20 \% & 93.08 \% & 86.28 \% & 0.1 s / 1 core & A. Mahmoud, J. Hu and S. Waslander: Dense Voxel Fusion for 3D Object Detection. WACV 2023.\\
Test\_dif & & 89.20 \% & 92.69 \% & 84.23 \% & 0.01 s / 1 core & \\
STD & & 89.19 \% & 94.74 \% & 86.42 \% & 0.08 s / GPU & Z. Yang, Y. Sun, S. Liu, X. Shen and J. Jia: STD: Sparse-to-Dense 3D Object Detector for Point Cloud. ICCV 2019.\\
FIRM-Net & & 89.18 \% & 92.56 \% & 86.33 \% & 0.07 s / 1 core & \\
Point-GNN & la & 89.17 \% & 93.11 \% & 83.90 \% & 0.6 s / GPU & W. Shi and R. Rajkumar: Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud. CVPR 2020.\\
HMFI & & 89.17 \% & 93.04 \% & 86.37 \% & 0.1 s / 1 core & X. Li, B. Shi, Y. Hou, X. Wu, T. Ma, Y. Li and L. He: Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection. ECCV 2022.\\
sec\_spark & & 89.16 \% & 90.89 \% & 84.84 \% & 0.03 s / GPU & \\
SSL-PointGNN & & 89.16 \% & 92.92 \% & 83.99 \% & 0.56 s / GPU & E. Erçelik, E. Yurtsever, M. Liu, Z. Yang, H. Zhang, P. Topçam, M. Listl, Y. Çaylı and A. Knoll: 3D Object Detection with a Self-supervised Lidar Scene Flow Backbone. arXiv preprint arXiv:2205.00705 2022.\\
SPG\_mini & la & 89.12 \% & 92.80 \% & 86.27 \% & 0.09 s / GPU & Q. Xu, Y. Zhou, W. Wang, C. Qi and D. Anguelov: SPG: Unsupervised Domain Adaptation for 3D Object Detection via Semantic Point Generation. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.\\
EQ-PVRCNN & & 89.09 \% & 94.55 \% & 86.42 \% & 0.2 s / GPU & Z. Yang, L. Jiang, Y. Sun, B. Schiele and J. Jia: A Unified Query-based Paradigm for Point Cloud Understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2022.\\
VoxSeT & & 89.07 \% & 92.70 \% & 86.29 \% & 33 ms / 1 core & C. He, R. Li, S. Li and L. Zhang: Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds. CVPR 2022.\\
RAFDet & & 89.07 \% & 92.64 \% & 85.96 \% & 0.1 s / 1 core & \\
RAFDet & & 89.05 \% & 92.29 \% & 84.35 \% & 0.01 s / 1 core & \\
3DSSD & & 89.02 \% & 92.66 \% & 85.86 \% & 0.04 s / GPU & Z. Yang, Y. Sun, S. Liu and J. Jia: 3DSSD: Point-based 3D Single Stage Object Detector. CVPR 2020.\\
RagNet3D & & 89.01 \% & 92.87 \% & 86.36 \% & 0.05 s / 1 core & \\
EPNet++ & & 89.00 \% & 95.41 \% & 85.73 \% & 0.1 s / GPU & Z. Liu, T. Huang, B. Li, X. Chen, X. Wang and X. Bai: EPNet++: Cascade Bi-Directional Fusion for Multi-Modal 3D Object Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022.\\
DDF & & 89.00 \% & 92.57 \% & 86.50 \% & 0.1 s / 1 core & \\
IOUFusion & & 89.00 \% & 92.47 \% & 84.10 \% & 0.1 s / GPU & \\
Focals Conv & & 89.00 \% & 92.67 \% & 86.33 \% & 0.1 s / 1 core & Y. Chen, Y. Li, X. Zhang, J. Sun and J. Jia: Focal Sparse Convolutional Networks for 3D Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2022.\\
RAFDet & & 88.99 \% & 92.23 \% & 84.21 \% & 0.01 s / 1 core & \\
LGNet-Car & & 88.98 \% & 92.83 \% & 86.26 \% & 0.11 s / 1 core & \\
USVLab BSAODet & & 88.90 \% & 92.66 \% & 86.23 \% & 0.04 s / 1 core & W. Xiao, Y. Peng, C. Liu, J. Gao, Y. Wu and X. Li: Balanced Sample Assignment and Objective for Single-Model Multi-Class 3D Object Detection. IEEE Transactions on Circuits and Systems for Video Technology 2023.\\
bs & & 88.88 \% & 94.53 \% & 86.00 \% & 0.1 s / 1 core & \\
CZY\_PPF\_Net & & 88.88 \% & 94.68 \% & 86.15 \% & 0.1 s / 1 core & \\
H$^2$3D R-CNN & & 88.87 \% & 92.85 \% & 86.07 \% & 0.03 s / 1 core & J. Deng, W. Zhou, Y. Zhang and H. Li: From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection. IEEE Transactions on Circuits and Systems for Video Technology 2021.\\
Pyramid R-CNN & & 88.84 \% & 92.19 \% & 86.21 \% & 0.07 s / 1 core & J. Mao, M. Niu, H. Bai, X. Liang, H. Xu and C. Xu: Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection. ICCV 2021.\\
CityBrainLab-CT3D & & 88.83 \% & 92.36 \% & 84.07 \% & 0.07 s / 1 core & H. Sheng, S. Cai, Y. Liu, B. Deng, J. Huang, X. Hua and M. Zhao: Improving 3D Object Detection with Channel-wise Transformer. ICCV 2021.\\
Voxel R-CNN & & 88.83 \% & 94.85 \% & 86.13 \% & 0.04 s / GPU & J. Deng, S. Shi, P. Li, W. Zhou, Y. Zhang and H. Li: Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection. AAAI 2021.\\
HVNet & & 88.82 \% & 92.83 \% & 83.38 \% & 0.03 s / GPU & M. Ye, S. Xu and T. Cao: HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection. CVPR 2020.\\
AMVFNet & & 88.82 \% & 92.68 \% & 86.18 \% & 0.04 s / GPU & \\
GD-MAE & & 88.82 \% & 94.22 \% & 83.54 \% & 0.07 s / 1 core & H. Yang, T. He, J. Liu, H. Chen, B. Wu, B. Lin, X. He and W. Ouyang: GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds. CVPR 2023.\\
SPG & la & 88.70 \% & 94.33 \% & 85.98 \% & 0.09 s / 1 core & Q. Xu, Y. Zhou, W. Wang, C. Qi and D. Anguelov: SPG: Unsupervised Domain Adaptation for 3D Object Detection via Semantic Point Generation. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.\\
MG & & 88.66 \% & 92.64 \% & 83.61 \% & 0.1 s / 1 core & \\
SIENet & & 88.65 \% & 92.38 \% & 86.03 \% & 0.08 s / 1 core & Z. Li, Y. Yao, Z. Quan, W. Yang and J. Xie: SIENet: Spatial Information Enhancement Network for 3D Object Detection from Point Cloud. 2021.\\
P2V-RCNN & & 88.63 \% & 92.72 \% & 86.14 \% & 0.1 s / 1 core & J. Li, S. Luo, Z. Zhu, H. Dai, A. Krylov, Y. Ding and L. Shao: P2V-RCNN: Point to Voxel Feature Learning for 3D Object Detection from Point Clouds. IEEE Access 2021.\\
FromVoxelToPoint & & 88.61 \% & 92.23 \% & 86.11 \% & 0.1 s / 1 core & J. Li, H. Dai, L. Shao and Y. Ding: From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder. MM '21: The 29th ACM International Conference on Multimedia (ACM MM) 2021.\\
RangeIoUDet & la & 88.59 \% & 92.28 \% & 85.83 \% & 0.02 s / GPU & Z. Liang, Z. Zhang, M. Zhang, X. Zhao and S. Pu: RangeIoUDet: Range Image Based Real-Time 3D Object Detector Optimized by Intersection Over Union. CVPR 2021.\\
af & & 88.58 \% & 92.43 \% & 86.05 \% & 1 s / GPU & \\
MFB3D & & 88.54 \% & 94.67 \% & 85.75 \% & 0.14 s / 1 core & \\
second\_iou\_baseline & & 88.48 \% & 92.24 \% & 85.57 \% & 0.05 s / 1 core & \\
EPNet & & 88.47 \% & 94.22 \% & 83.69 \% & 0.1 s / 1 core & T. Huang, Z. Liu, X. Chen and X. Bai: EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection. ECCV 2020.\\
CenterNet3D & & 88.46 \% & 91.80 \% & 83.62 \% & 0.04 s / GPU & G. Wang, B. Tian, Y. Ai, T. Xu, L. Chen and D. Cao: CenterNet3D: An Anchor free Object Detector for Autonomous Driving. 2020.\\
FARP-Net & & 88.45 \% & 91.20 \% & 86.01 \% & 0.06 s / GPU & T. Xie, L. Wang, K. Wang, R. Li, X. Zhang, H. Zhang, L. Yang, H. Liu and J. Li: FARP-Net: Local-Global Feature Aggregation and Relation-Aware Proposals for 3D Object Detection. IEEE Transactions on Multimedia 2023.\\
PUDet & & 88.42 \% & 92.68 \% & 83.70 \% & 0.3 s / GPU & \\
AFFN-Ga & & 88.41 \% & 92.49 \% & 85.89 \% & 0.5 s / 1 core & \\
RangeRCNN & la & 88.40 \% & 92.15 \% & 85.74 \% & 0.06 s / GPU & Z. Liang, M. Zhang, Z. Zhang, X. Zhao and S. Pu: RangeRCNN: Towards Fast and Accurate 3D Object Detection with Range Image Representation. arXiv preprint arXiv:2009.00206 2020.\\
second\_iou\_baseline & & 88.40 \% & 92.12 \% & 85.54 \% & 0.03 s / 1 core & \\
Patches & la & 88.39 \% & 92.72 \% & 83.19 \% & 0.15 s / GPU & J. Lehner, A. Mitterecker, T. Adler, M. Hofmarcher, B. Nessler and S. Hochreiter: Patch Refinement: Localized 3D Object Detection. arXiv preprint arXiv:1910.04093 2019.\\
3D IoU-Net & & 88.38 \% & 94.76 \% & 81.93 \% & 0.1 s / 1 core & J. Li, S. Luo, Z. Zhu, H. Dai, S. Krylov, Y. Ding and L. Shao: 3D IoU-Net: IoU Guided 3D Object Detector for Point Clouds. arXiv preprint arXiv:2004.04962 2020.\\
StructuralIF & & 88.38 \% & 91.78 \% & 85.67 \% & 0.02 s / 8 cores & J. Pei An: Deep structural information fusion for 3D object detection on LiDAR-camera system. Accepted in CVIU 2021.\\
PASS-PV-RCNN-Plus & & 88.37 \% & 92.17 \% & 85.75 \% & 1 s / 1 core & Anonymous: Leveraging Anchor-based LiDAR 3D Object Detection via Point Assisted Sample Selection. will submit to computer vision conference/journal 2024.\\
AAMVFNet & & 88.36 \% & 92.31 \% & 85.81 \% & 0.04 s / GPU & \\
CLOCs\_SecCas & & 88.23 \% & 91.16 \% & 82.63 \% & 0.1 s / 1 core & S. Pang, D. Morris and H. Radha: CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020.\\
UberATG-MMF & la & 88.21 \% & 93.67 \% & 81.99 \% & 0.08 s / GPU & M. Liang*, B. Yang*, Y. Chen, R. Hu and R. Urtasun: Multi-Task Multi-Sensor Fusion for 3D Object Detection. CVPR 2019.\\
Patches - EMP & la & 88.17 \% & 94.49 \% & 84.75 \% & 0.5 s / GPU & J. Lehner, A. Mitterecker, T. Adler, M. Hofmarcher, B. Nessler and S. Hochreiter: Patch Refinement: Localized 3D Object Detection. arXiv preprint arXiv:1910.04093 2019.\\
SRDL & & 88.17 \% & 92.01 \% & 85.43 \% & 0.05 s / 1 core & \\
Res3DNet & & 88.16 \% & 91.71 \% & 84.85 \% & 0.05 s / GPU & \\
P2P & & 88.15 \% & 91.92 \% & 81.12 \% & 0.1 s / GPU & \\
PointPainting & la & 88.11 \% & 92.45 \% & 83.36 \% & 0.4 s / GPU & S. Vora, A. Lang, B. Helou and O. Beijbom: PointPainting: Sequential Fusion for 3D Object Detection. CVPR 2020.\\
SERCNN & la & 88.10 \% & 94.11 \% & 83.43 \% & 0.1 s / 1 core & D. Zhou, J. Fang, X. Song, L. Liu, J. Yin, Y. Dai, H. Li and R. Yang: Joint 3D Instance Segmentation and Object Detection for Autonomous Driving. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2020.\\
PartA2\_basline & & 88.09 \% & 92.35 \% & 85.42 \% & 0.09 s / 1 core & \\
Associate-3Ddet & & 88.09 \% & 91.40 \% & 82.96 \% & 0.05 s / 1 core & L. Du, X. Ye, X. Tan, J. Feng, Z. Xu, E. Ding and S. Wen: Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection. The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020.\\
HotSpotNet & & 88.09 \% & 94.06 \% & 83.24 \% & 0.04 s / 1 core & Q. Chen, L. Sun, Z. Wang, K. Jia and A. Yuille: Object as Hotspots. Proceedings of the European Conference on Computer Vision (ECCV) 2020.\\
Faraway-Frustum & la & 88.08 \% & 91.90 \% & 85.35 \% & 0.1 s / GPU & H. Zhang, D. Yang, E. Yurtsever, K. Redmill and U. Ozguner: Faraway-frustum: Dealing with lidar sparsity for 3D object detection using fusion. 2021 IEEE International Intelligent Transportation Systems Conference (ITSC) 2021.\\
SFEBEV & & 88.08 \% & 93.44 \% & 83.01 \% & 0.01 s / 1 core & \\
pointpillar\_spark\_fo & & 88.02 \% & 92.48 \% & 84.82 \% & 0.1 s / 1 core & \\
UberATG-HDNET & la & 87.98 \% & 93.13 \% & 81.23 \% & 0.05 s / GPU & B. Yang, M. Liang and R. Urtasun: HDNET: Exploiting HD Maps for 3D Object Detection. 2nd Conference on Robot Learning (CoRL) 2018.\\
spark\_pointpillar2 & & 87.93 \% & 92.74 \% & 84.70 \% & 10 s / 1 core & \\
BAPartA2S-4h & & 87.89 \% & 91.96 \% & 83.31 \% & 0.1 s / 1 core & \\
Fast Point R-CNN & la & 87.84 \% & 90.87 \% & 80.52 \% & 0.06 s / GPU & Y. Chen, S. Liu, X. Shen and J. Jia: Fast Point R-CNN. Proceedings of the IEEE international conference on computer vision (ICCV) 2019.\\
MMLab-PartA$^2$ & la & 87.79 \% & 91.70 \% & 84.61 \% & 0.08 s / GPU & S. Shi, Z. Wang, J. Shi, X. Wang and H. Li: From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020.\\
SIF & & 87.76 \% & 91.44 \% & 85.15 \% & 0.1 s / 1 core & P. An: SIF. Submitted to CVIU 2021.\\
MVAF-Net & & 87.73 \% & 91.95 \% & 85.00 \% & 0.06 s / 1 core & G. Wang, B. Tian, Y. Zhang, L. Chen, D. Cao and J. Wu: Multi-View Adaptive Fusion Network for 3D Object Detection. arXiv preprint arXiv:2011.00652 2020.\\
DVFENet & & 87.68 \% & 90.93 \% & 84.60 \% & 0.05 s / 1 core & Y. He, G. Xia, Y. Luo, L. Su, Z. Zhang, W. Li and P. Wang: DVFENet: Dual-branch Voxel Feature Extraction Network for 3D Object Detection. Neurocomputing 2021.\\
S-AT GCN & & 87.68 \% & 90.85 \% & 84.20 \% & 0.02 s / GPU & L. Wang, C. Wang, X. Zhang, T. Lan and J. Li: S-AT GCN: Spatial-Attention Graph Convolution Network based Feature Enhancement for 3D Object Detection. CoRR 2021.\\
RangeDet (Official) & & 87.67 \% & 90.93 \% & 82.92 \% & 0.02 s / 1 core & L. Fan, X. Xiong, F. Wang, N. Wang and Z. Zhang: RangeDet: In Defense of Range View for LiDAR-Based 3D Object Detection. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.\\
pointpillar\_baseline & & 87.61 \% & 92.52 \% & 83.84 \% & 0.01 s / 1 core & \\
Second\_baseline & & 87.60 \% & 90.94 \% & 84.36 \% & 0.03 s / 1 core & \\
VoxelFSD-S & & 87.60 \% & 90.94 \% & 84.11 \% & 0.05 s / 1 core & \\
SC-SSD & & 87.56 \% & 90.70 \% & 84.36 \% & 1 s / 1 core & \\
MODet & la & 87.56 \% & 90.80 \% & 82.69 \% & 0.05 s / & Y. Zhang, Z. Xiang, C. Qiao and S. Chen: Accurate and Real-Time Object Detection Based on Bird's Eye View on 3D Point Clouds. 2019 International Conference on 3D Vision (3DV) 2019.\\
TF-PartA2 & & 87.54 \% & 91.93 \% & 83.33 \% & 0.1 s / 1 core & \\
AB3DMOT & la on & 87.53 \% & 91.99 \% & 81.03 \% & 0.0047 s / 1 core & X. Weng and K. Kitani: A Baseline for 3D Multi-Object Tracking. arXiv:1907.03961 2019.\\
mm3d\_PartA2 & & 87.51 \% & 91.75 \% & 83.01 \% & 0.1 s / GPU & \\
SeSame-point & & 87.49 \% & 90.84 \% & 83.77 \% & N/A s / TITAN RTX & \\
PointRGCN & & 87.49 \% & 91.63 \% & 80.73 \% & 0.26 s / GPU & J. Zarzar, S. Giancola and B. Ghanem: PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement. ArXiv 2019.\\
MGAF-3DSSD & & 87.47 \% & 92.70 \% & 82.19 \% & 0.1 s / 1 core & J. Li, H. Dai, L. Shao and Y. Ding: Anchor-free 3D Single Stage Detector with Mask-Guided Attention for Point Cloud. MM '21: The 29th ACM International Conference on Multimedia (ACM MM) 2021.\\
PC-CNN-V2 & la & 87.40 \% & 91.19 \% & 79.35 \% & 0.5 s / GPU & X. Du, M. Ang, S. Karaman and D. Rus: A General Pipeline for 3D Detection of Vehicles. 2018 IEEE International Conference on Robotics and Automation (ICRA) 2018.\\
MMLab-PointRCNN & la & 87.39 \% & 92.13 \% & 82.72 \% & 0.1 s / GPU & S. Shi, X. Wang and H. Li: PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019.\\
Sem-Aug & la & 87.37 \% & 93.35 \% & 82.43 \% & 0.1 s / GPU & L. Zhao, M. Wang and Y. Yue: Sem-Aug: Improving Camera-LiDAR Feature Fusion With Semantic Augmentation for 3D Vehicle Detection. IEEE Robotics and Automation Letters 2022.\\
MAFF-Net(DAF-Pillar) & & 87.34 \% & 90.79 \% & 77.66 \% & 0.04 s / 1 core & Z. Zhang, Z. Liang, M. Zhang, X. Zhao, Y. Ming, T. Wenming and S. Pu: MAFF-Net: Filter False Positive for 3D Vehicle Detection with Multi-modal Adaptive Feature Fusion. arXiv preprint arXiv:2009.10945 2020.\\
Harmonic PointPillar & & 87.28 \% & 90.89 \% & 82.54 \% & 0.01 s / 1 core & H. Zhang, J. Mekala, Z. Nain, J. Park and H. Jung: 3D Harmonic Loss: Towards Task-consistent and Time-friendly 3D Object Detection for V2X Orchestration. will submit to IEEE Transactions on Vehicular Technology 2022.\\
PASS-PointPillar & & 87.23 \% & 91.07 \% & 81.98 \% & 1 s / 1 core & Anonymous: Leveraging Anchor-based LiDAR 3D Object Detection via Point Assisted Sample Selection. will submit to computer vision conference/journal 2024.\\
HRI-VoxelFPN & & 87.21 \% & 92.75 \% & 79.82 \% & 0.02 s / GPU & H. Kuang, B. Wang, J. An, M. Zhang and Z. Zhang: Voxel-FPN: Multi-scale Voxel Feature Aggregation in 3D Object Detection from Point Clouds. Sensors 2020.\\
epBRM & la & 87.13 \% & 90.70 \% & 81.92 \% & 0.1 s / GPU & K. Shin: Improving a Quality of 3D Object Detection by Spatial Transformation Mechanism. arXiv preprint arXiv:1910.04853 2019.\\
LVFSD & & 87.12 \% & 90.42 \% & 83.91 \% & 0.06 s / & \\
XT-PartA2 & & 87.08 \% & 90.89 \% & 82.70 \% & 0.1 s / GPU & \\
centerpoint\_pcdet & & 87.04 \% & 90.04 \% & 83.32 \% & 0.06 s / 1 core & \\
SARPNET & & 86.92 \% & 92.21 \% & 81.68 \% & 0.05 s / 1 core & Y. Ye, H. Chen, C. Zhang, X. Hao and Z. Zhang: SARPNET: Shape Attention Regional Proposal Network for LiDAR-based 3D Object Detection. Neurocomputing 2019.\\
SeSame-pillar & & 86.88 \% & 90.61 \% & 81.93 \% & N/A s / TITAN RTX & \\
ARPNET & & 86.81 \% & 90.06 \% & 79.41 \% & 0.08 s / GPU & Y. Ye, C. Zhang and X. Hao: ARPNET: attention region proposal network for 3D object detection. Science China Information Sciences 2019.\\
C-GCN & & 86.78 \% & 91.11 \% & 80.09 \% & 0.147 s / GPU & J. Zarzar, S. Giancola and B. Ghanem: PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement. ArXiv 2019.\\
PointPillars & la & 86.56 \% & 90.07 \% & 82.81 \% & 16 ms / & A. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang and O. Beijbom: PointPillars: Fast Encoders for Object Detection from Point Clouds. CVPR 2019.\\
TANet & & 86.54 \% & 91.58 \% & 81.19 \% & 0.035 s / GPU & Z. Liu, X. Zhao, T. Huang, R. Hu, Y. Zhou and X. Bai: TANet: Robust 3D Object Detection from Point Clouds with Triple Attention. AAAI 2020.\\
PCNet3D & & 86.54 \% & 90.09 \% & 81.43 \% & 0.05 s / GPU & \\
MEDL-U & & 86.50 \% & 91.75 \% & 79.29 \% & 1 s / GPU & \\
SCNet & la & 86.48 \% & 90.07 \% & 81.30 \% & 0.04 s / GPU & Z. Wang, H. Fu, L. Wang, L. Xiao and B. Dai: SCNet: Subdivision Coding Network for Object Detection Based on 3D Point Cloud. IEEE Access 2019.\\
MM\_SECOND & & 86.39 \% & 90.52 \% & 81.49 \% & 0.05 s / GPU & \\
SegVoxelNet & & 86.37 \% & 91.62 \% & 83.04 \% & 0.04 s / 1 core & H. Yi, S. Shi, M. Ding, J. Sun, K. Xu, H. Zhou, Z. Wang, S. Li and G. Wang: SegVoxelNet: Exploring Semantic Context and Depth-aware Features for 3D Vehicle Detection from Point Cloud. ICRA 2020.\\
IIOU\_LDR & & 86.31 \% & 91.80 \% & 81.30 \% & 0.03 s / 1 core & \\
3D IoU Loss & la & 86.22 \% & 91.36 \% & 81.20 \% & 0.08 s / GPU & D. Zhou, J. Fang, X. Song, C. Guan, J. Yin, Y. Dai and R. Yang: IoU Loss for 2D/3D Object Detection. International Conference on 3D Vision (3DV) 2019.\\
VSAC & & 86.22 \% & 91.98 \% & 81.50 \% & 0.07 s / 1 core & \\
prcnn\_v18\_80\_100 & & 86.20 \% & 90.79 \% & 81.39 \% & 0.1 s / 1 core & \\
voxelnext\_pcdet & & 86.15 \% & 89.72 \% & 82.34 \% & 0.05 s / 1 core & \\
ROT\_S3D & & 86.11 \% & 91.33 \% & 81.17 \% & 0.1 s / GPU & \\
SeSame-pillar w/scor & & 86.11 \% & 90.43 \% & 81.38 \% & N/A s / 1 core & \\
R50\_SACINet & & 86.10 \% & 91.70 \% & 83.15 \% & 0.06 s / 1 core & \\
R-GCN & & 86.05 \% & 91.91 \% & 81.05 \% & 0.16 s / GPU & J. Zarzar, S. Giancola and B. Ghanem: PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement. ArXiv 2019.\\
UberATG-PIXOR++ & la & 86.01 \% & 93.28 \% & 80.11 \% & 0.035 s / GPU & B. Yang, M. Liang and R. Urtasun: HDNET: Exploiting HD Maps for 3D Object Detection. 2nd Conference on Robot Learning (CoRL) 2018.\\
HINTED & & 86.01 \% & 90.61 \% & 79.29 \% & 0.04 s / 1 core & \\
L\_SACINet & & 85.99 \% & 91.21 \% & 81.05 \% & 0.04 s / 1 core & \\
PL++: PV-RCNN++ & st la & 85.89 \% & 91.76 \% & 81.29 \% & 0.342 s / & \\
Sem-Aug-PointRCNN++ & & 85.88 \% & 91.68 \% & 83.37 \% & 0.1 s / 8 cores & L. Zhao, M. Wang and Y. Yue: Sem-Aug: Improving Camera-LiDAR Feature Fusion With Semantic Augmentation for 3D Vehicle Detection. IEEE Robotics and Automation Letters 2022.\\
DASS & & 85.85 \% & 91.74 \% & 80.97 \% & 0.09 s / 1 core & O. Unal, L. Van Gool and D. Dai: Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2021.\\
F-ConvNet & la & 85.84 \% & 91.51 \% & 76.11 \% & 0.47 s / GPU & Z. Wang and K. Jia: Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection. IROS 2019.\\
SecAtten & & 85.84 \% & 91.32 \% & 82.57 \% & 0.1 s / 1 core & \\
PI-RCNN & & 85.81 \% & 91.44 \% & 81.00 \% & 0.1 s / 1 core & L. Xie, C. Xiang, Z. Yu, G. Xu, Z. Yang, D. Cai and X. He: PI-RCNN: An Efficient Multi-sensor 3D Object Detector with Point-based Attentive Cont-conv Fusion Module. AAAI 2020: The Thirty-Fourth AAAI Conference on Artificial Intelligence 2020.\\
PointRGBNet & & 85.73 \% & 91.39 \% & 80.68 \% & 0.08 s / 4 cores & P. Xie Desheng: Real-time Detection of 3D Objects Based on Multi-Sensor Information Fusion. Automotive Engineering 2022.\\
SeSame-voxel & & 85.62 \% & 89.86 \% & 80.95 \% & N/A s / TITAN RTX & \\
WA & & 85.61 \% & 90.76 \% & 79.99 \% & 0.3 s / GPU & \\
UberATG-ContFuse & la & 85.35 \% & 94.07 \% & 75.88 \% & 0.06 s / GPU & M. Liang, B. Yang, S. Wang and R. Urtasun: Deep Continuous Fusion for Multi-Sensor 3D Object Detection. ECCV 2018.\\
PFF3D & la & 85.08 \% & 89.61 \% & 80.42 \% & 0.05 s / GPU & L. Wen and K. Jo: Fast and Accurate 3D Object Detection for Lidar-Camera-Based Autonomous Vehicles Using One Shared Voxel-Based Backbone. IEEE Access 2021.\\
AVOD & la & 84.95 \% & 89.75 \% & 78.32 \% & 0.08 s / & J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object Detection from View Aggregation. IROS 2018.\\
WS3D & la & 84.93 \% & 90.96 \% & 77.96 \% & 0.1 s / GPU & Q. Meng, W. Wang, T. Zhou, J. Shen, L. Van Gool and D. Dai: Weakly Supervised 3D Object Detection from Lidar Point Cloud. 2020.\\
PI-SECOND & & 84.83 \% & 90.15 \% & 79.86 \% & 0.05 s / GPU & \\
ODGS & & 84.82 \% & 89.59 \% & 78.20 \% & 0.1 s / 1 core & \\
AVOD-FPN & la & 84.82 \% & 90.99 \% & 79.62 \% & 0.1 s / & J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object Detection from View Aggregation. IROS 2018.\\
F-PointNet & la & 84.67 \% & 91.17 \% & 74.77 \% & 0.17 s / GPU & C. Qi, W. Liu, C. Wu, H. Su and L. Guibas: Frustum PointNets for 3D Object Detection from RGB-D Data. arXiv preprint arXiv:1711.08488 2017.\\
AEPF & & 84.63 \% & 89.99 \% & 80.02 \% & 0.05 s / GPU & \\
mmFUSION & & 84.60 \% & 90.35 \% & 79.82 \% & 1 s / 1 core & J. Ahmad and A. Del Bue: mmFUSION: Multimodal Fusion for 3D Objects Detection. arXiv preprint arXiv:2311.04058 2023.\\
3DBN & la & 83.94 \% & 89.66 \% & 76.50 \% & 0.13 s / & X. Li, J. Guivant, N. Kwok and Y. Xu: 3D Backbone Network for 3D Object Detection. CoRR 2019.\\
EOTL & & 83.14 \% & 89.10 \% & 71.41 \% & TBD s / 1 core & R. Yang, Z. Yan, T. Yang, Y. Wang and Y. Ruichek: Efficient Online Transfer Learning for Road Participants Detection in Autonomous Driving. IEEE Sensors Journal 2023.\\
MLOD & la & 82.68 \% & 90.25 \% & 77.97 \% & 0.12 s / GPU & J. Deng and K. Czarnecki: MLOD: A multi-view 3D object detection based on robust feature fusion method. arXiv preprint arXiv:1909.04163 2019.\\
BirdNet+ & la & 81.85 \% & 87.43 \% & 75.36 \% & 0.11 s / & A. Barrera, J. Beltrán, C. Guindel, J. Iglesias and F. García: BirdNet+: Two-Stage 3D Object Detection in LiDAR through a Sparsity-Invariant Bird’s Eye View. IEEE Access 2021.\\
DMF & st & 80.29 \% & 84.64 \% & 76.05 \% & 0.2 s / 1 core & X. J. Chen and W. Xu: Disparity-Based Multiscale Fusion Network for Transportation Detection. IEEE Transactions on Intelligent Transportation Systems 2022.\\
UberATG-PIXOR & la & 80.01 \% & 83.97 \% & 74.31 \% & 0.035 s / & B. Yang, W. Luo and R. Urtasun: PIXOR: Real-time 3D Object Detection from Point Clouds. CVPR 2018.\\
MV3D (LIDAR) & la & 78.98 \% & 86.49 \% & 72.23 \% & 0.24 s / GPU & X. Chen, H. Ma, J. Wan, B. Li and T. Xia: Multi-View 3D Object Detection Network for Autonomous Driving. CVPR 2017.\\
DSGN++ & st & 78.94 \% & 88.55 \% & 69.74 \% & 0.2 s / & Y. Chen, S. Huang, S. Liu, B. Yu and J. Jia: DSGN++: Exploiting Visual-Spatial Relation for Stereo-Based 3D Detectors. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022.\\
MV3D & la & 78.93 \% & 86.62 \% & 69.80 \% & 0.36 s / GPU & X. Chen, H. Ma, J. Wan, B. Li and T. Xia: Multi-View 3D Object Detection Network for Autonomous Driving. CVPR 2017.\\
StereoDistill & & 78.59 \% & 89.03 \% & 69.34 \% & 0.4 s / 1 core & Z. Liu, X. Ye, X. Tan, D. Errui, Y. Zhou and X. Bai: StereoDistill: Pick the Cream from LiDAR for Distilling Stereo-based 3D Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence 2023.\\
MMLAB LIGA-Stereo & st & 76.78 \% & 88.15 \% & 67.40 \% & 0.4 s / 1 core & X. Guo, S. Shi, X. Wang and H. Li: LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.\\
RCD & & 75.83 \% & 82.26 \% & 69.61 \% & 0.1 s / GPU & A. Bewley, P. Sun, T. Mensink, D. Anguelov and C. Sminchisescu: Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection. Conference on Robot Learning (CoRL) 2020.\\
LaserNet & & 74.52 \% & 79.19 \% & 68.45 \% & 12 ms / GPU & G. Meyer, A. Laddha, E. Kee, C. Vallespi-Gonzalez and C. Wellington: LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.\\
PL++ (SDN+GDC) & st la & 73.80 \% & 84.61 \% & 65.59 \% & 0.6 s / GPU & Y. You, Y. Wang, W. Chao, D. Garg, G. Pleiss, B. Hariharan, M. Campbell and K. Weinberger: Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving. International Conference on Learning Representations 2020.\\
SNVC & st & 73.61 \% & 86.88 \% & 64.49 \% & 1 s / GPU & S. Li, Z. Liu, Z. Shen and K. Cheng: Stereo Neural Vernier Caliper. Proceedings of the AAAI Conference on Artificial Intelligence 2022.\\
A3DODWTDA & la & 73.26 \% & 79.58 \% & 62.77 \% & 0.08 s / GPU & F. Gustafsson and E. Linder-Norén: Automotive 3D Object Detection Without Target Domain Annotations. 2018.\\
Complexer-YOLO & la & 68.96 \% & 77.24 \% & 64.95 \% & 0.06 s / GPU & M. Simon, K. Amende, A. Kraus, J. Honer, T. Samann, H. Kaulbersch, S. Milz and H. Michael Gross: Complexer-YOLO: Real-Time 3D Object Detection and Tracking on Semantic Point Clouds. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2019.\\
TopNet-Retina & la & 68.16 \% & 80.16 \% & 63.43 \% & 52 ms / & S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.\\
SeSame-point w/score & & 67.18 \% & 83.44 \% & 57.68 \% & N/A s / 1 core & \\
SeSame-point w/score & & 67.18 \% & 83.44 \% & 57.68 \% & N/A s / GPU & \\
CG-Stereo & st & 66.44 \% & 85.29 \% & 58.95 \% & 0.57 s / & C. Li, J. Ku and S. Waslander: Confidence Guided Stereo 3D Object Detection with Split Depth Estimation. IROS 2020.\\
PLUME & st & 66.27 \% & 82.97 \% & 56.70 \% & 0.15 s / GPU & Y. Wang, B. Yang, R. Hu, M. Liang and R. Urtasun: PLUME: Efficient 3D Object Detection from Stereo Images. IROS 2021.\\
CDN & st & 66.24 \% & 83.32 \% & 57.65 \% & 0.6 s / GPU & D. Garg, Y. Wang, B. Hariharan, M. Campbell, K. Weinberger and W. Chao: Wasserstein Distances for Stereo Disparity Estimation. Advances in Neural Information Processing Systems (NeurIPS) 2020.\\
DSGN & st & 65.05 \% & 82.90 \% & 56.60 \% & 0.67 s / & Y. Chen, S. Liu, X. Shen and J. Jia: DSGN: Deep Stereo Geometry Network for 3D Object Detection. CVPR 2020.\\
TopNet-DecayRate & la & 64.60 \% & 79.74 \% & 58.04 \% & 92 ms / & S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.\\
SeSame-voxel w/score & & 63.36 \% & 71.98 \% & 57.52 \% & N/A s / GPU & \\
BirdNet+ (legacy) & la & 63.33 \% & 84.80 \% & 61.23 \% & 0.1 s / & A. Barrera, C. Guindel, J. Beltrán and F. García: BirdNet+: End-to-End 3D Object Detection in LiDAR Bird’s Eye View. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) 2020.\\
3D FCN & la & 61.67 \% & 70.62 \% & 55.61 \% & $>$5 s / 1 core & B. Li: 3D Fully Convolutional Network for Vehicle Detection in Point Cloud. IROS 2017.\\
CDN-PL++ & st & 61.04 \% & 81.27 \% & 52.84 \% & 0.4 s / GPU & D. Garg, Y. Wang, B. Hariharan, M. Campbell, K. Weinberger and W. Chao: Wasserstein Distances for Stereo Disparity Estimation. Advances in Neural Information Processing Systems 2020.\\
BirdNet & la & 59.83 \% & 84.17 \% & 57.35 \% & 0.11 s / & J. Beltrán, C. Guindel, F. Moreno, D. Cruzado, F. García and A. Escalera: BirdNet: A 3D Object Detection Framework from LiDAR Information. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.\\
TopNet-UncEst & la & 59.67 \% & 72.05 \% & 51.67 \% & 0.09 s / & S. Wirges, M. Braun, M. Lauer and C. Stiller: Capturing Object Detection Uncertainty in Multi-Layer Grid Maps. 2019.\\
RT3D-GMP & st & 59.00 \% & 69.14 \% & 45.49 \% & 0.06 s / GPU & H. Königshof and C. Stiller: Learning-Based Shape Estimation with Grid Map Patches for Realtime 3D Object Detection for Automated Driving. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) 2020.\\
Disp R-CNN (velo) & st & 58.62 \% & 79.76 \% & 47.73 \% & 0.387 s / GPU & J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao: Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation. CVPR 2020.\\
ESGN & st & 58.12 \% & 78.10 \% & 49.28 \% & 0.06 s / GPU & A. Gao, Y. Pang, J. Nie, Z. Shao, J. Cao, Y. Guo and X. Li: ESGN: Efficient Stereo Geometry Network for Fast 3D Object Detection. IEEE Transactions on Circuits and Systems for Video Technology 2022.\\
Pseudo-LiDAR++ & st & 58.01 \% & 78.31 \% & 51.25 \% & 0.4 s / GPU & Y. You, Y. Wang, W. Chao, D. Garg, G. Pleiss, B. Hariharan, M. Campbell and K. Weinberger: Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving. International Conference on Learning Representations 2020.\\
Disp R-CNN & st & 57.98 \% & 79.61 \% & 47.09 \% & 0.387 s / GPU & J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao: Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation. CVPR 2020.\\
ZoomNet & st & 54.91 \% & 72.94 \% & 44.14 \% & 0.3 s / 1 core & L. Z. Xu: ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence 2020.\\
VoxelJones & & 53.96 \% & 66.21 \% & 47.66 \% & 0.18 s / 1 core & M. Motro and J. Ghosh: Vehicular Multi-object Tracking with Persistent Detector Failures. arXiv preprint arXiv:1907.11306 2019.\\
TopNet-HighRes & la & 53.05 \% & 67.84 \% & 46.99 \% & 101 ms / & S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.\\
OC Stereo & st & 51.47 \% & 68.89 \% & 42.97 \% & 0.35 s / 1 core & A. Pon, J. Ku, C. Li and S. Waslander: Object-Centric Stereo Matching for 3D Object Detection. ICRA 2020.\\
YOLOStereo3D & st & 50.28 \% & 76.10 \% & 36.86 \% & 0.1 s / & Y. Liu, L. Wang and M. Liu: YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection. 2021 International Conference on Robotics and Automation (ICRA) 2021.\\
SST [st] & st & 47.07 \% & 71.08 \% & 41.90 \% & 1 s / 1 core & \\
RT3DStereo & st & 46.82 \% & 58.81 \% & 38.38 \% & 0.08 s / GPU & H. Königshof, N. Salscheider and C. Stiller: Realtime 3D Object Detection for Automated Driving Using Stereo Vision and Semantic Information. Proc. IEEE Intl. Conf. Intelligent Transportation Systems 2019.\\
Pseudo-Lidar & st & 45.00 \% & 67.30 \% & 38.40 \% & 0.4 s / GPU & Y. Wang, W. Chao, D. Garg, B. Hariharan, M. Campbell and K. Weinberger: Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.\\
RT3D & la & 44.00 \% & 56.44 \% & 42.34 \% & 0.09 s / GPU & Y. Zeng, Y. Hu, S. Liu, J. Ye, Y. Han, X. Li and N. Sun: RT3D: Real-Time 3-D Vehicle Detection in LiDAR Point Cloud for Autonomous Driving. IEEE Robotics and Automation Letters 2018.\\
Stereo CenterNet & st & 42.12 \% & 62.97 \% & 35.37 \% & 0.04 s / GPU & Y. Shi, Y. Guo, Z. Mi and X. Li: Stereo CenterNet-based 3D object detection for autonomous driving. Neurocomputing 2022.\\
Stereo R-CNN & st & 41.31 \% & 61.92 \% & 33.42 \% & 0.3 s / GPU & P. Li, X. Chen and S. Shen: Stereo R-CNN based 3D Object Detection for Autonomous Driving. CVPR 2019.\\
MonoTRKDv2 & & 36.89 \% & 46.87 \% & 33.39 \% & 40 s / 1 core & \\
DA3D+KM3D+v2-99 & & 34.88 \% & 44.27 \% & 30.29 \% & 0.120 s / GPU & Y. Jia, J. Wang, H. Pan and W. Sun: Enhancing Monocular 3-D Object Detection Through Data Augmentation Strategies. IEEE Transactions on Instrumentation and Measurement 2024.\\
CIE + DM3D & & 33.13 \% & 46.17 \% & 28.80 \% & 0.1 s / 1 core & Anonymities: Consistency of Implicit and Explicit Features Matters for Monocular 3D Object Detection. arXiv preprint arXiv:2207.07933 2022.\\
StereoFENet & st & 32.96 \% & 49.29 \% & 25.90 \% & 0.15 s / 1 core & W. Bao, B. Xu and Z. Chen: MonoFENet: Monocular 3D Object Detection with Feature Enhancement Networks. IEEE Transactions on Image Processing 2019.\\
MonoTAKD V2 & & 32.31 \% & 43.83 \% & 28.48 \% & 0.1 s / 1 core & \\
error & & 30.78 \% & 49.96 \% & 26.51 \% & 1 s / 1 core & \\
monodetrnext-a & & 30.68 \% & 37.32 \% & 31.29 \% & 0.04 s / 1 core & \\
Mobile Stereo R-CNN & st & 28.78 \% & 44.51 \% & 22.30 \% & 1.8 s / & M. Hussein, M. Khalil and B. Abdullah: 3D Object Detection using Mobile Stereo R-CNN on Nvidia Jetson TX2. International Conference on Advanced Engineering, Technology and Applications (ICAETA) 2021.\\
DA3D+KM3D & & 28.71 \% & 39.50 \% & 25.20 \% & 0.02 s / GPU & Y. Jia, J. Wang, H. Pan and W. Sun: Enhancing Monocular 3-D Object Detection Through Data Augmentation Strategies. IEEE Transactions on Instrumentation and Measurement 2024.\\
CIE & & 28.50 \% & 41.41 \% & 23.88 \% & 0.1 s / 1 core & Anonymities: Consistency of Implicit and Explicit Features Matters for Monocular 3D Object Detection. arXiv preprint arXiv:2207.07933 2022.\\
monodetrnext-f & & 28.12 \% & 34.56 \% & 28.33 \% & 0.03 s / GPU & \\
MonoTAKD & & 27.76 \% & 38.75 \% & 24.14 \% & 0.1 s / 1 core & \\
MonoLTKD & & 27.76 \% & 38.75 \% & 24.14 \% & 0.04 s / 1 core & \\
MonoLTKD\_V3 & & 27.75 \% & 38.75 \% & 24.13 \% & 0.04 s / 1 core & \\
DA3D & & 26.92 \% & 36.83 \% & 23.41 \% & 0.03 s / 1 core & Y. Jia, J. Wang, H. Pan and W. Sun: Enhancing Monocular 3-D Object Detection Through Data Augmentation Strategies. IEEE Transactions on Instrumentation and Measurement 2024.\\
MonoLiG & & 26.83 \% & 35.73 \% & 24.24 \% & 0.03 s / 1 core & A. Hekimoglu, M. Schmidt and A. Ramiro: Monocular 3D Object Detection with LiDAR Guided Semi Supervised Active Learning. 2023.\\
ZJC & & 26.21 \% & 41.36 \% & 22.64 \% & 0.5 s / 1 core & \\
Sample & & 26.21 \% & 35.31 \% & 22.28 \% & 0.01 s / 1 core & \\
MonoLSS & & 25.95 \% & 34.89 \% & 22.59 \% & 0.04 s / 1 core & Z. Li, J. Jia and Y. Shi: MonoLSS: Learnable Sample Selection For Monocular 3D Detection. International Conference on 3D Vision 2024.\\
CMKD & & 25.82 \% & 38.98 \% & 22.80 \% & 0.1 s / 1 core & Y. Hong, H. Dai and Y. Ding: Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection. ECCV 2022.\\
Occlude3D & & 25.41 \% & 33.08 \% & 20.75 \% & 0.01 s / 1 core & \\
PS-SVDM & & 24.82 \% & 38.18 \% & 20.89 \% & 1 s / 1 core & Y. Shi: SVDM: Single-View Diffusion Model for Pseudo-Stereo 3D Object Detection. arXiv preprint arXiv:2307.02270 2023.\\
LPCG-Monoflex & & 24.81 \% & 35.96 \% & 21.86 \% & 0.03 s / 1 core & L. Peng, F. Liu, Z. Yu, S. Yan, D. Deng, Z. Yang, H. Liu and D. Cai: Lidar Point Cloud Guided Monocular 3D Object Detection. ECCV 2022.\\
NeurOCS & & 24.49 \% & 37.27 \% & 20.89 \% & 0.1 s / GPU & Z. Min, B. Zhuang, S. Schulter, B. Liu, E. Dunn and M. Chandraker: NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization. CVPR 2023.\\
Mix-Teaching & & 24.23 \% & 35.74 \% & 20.80 \% & 30 s / 1 core & L. Yang, X. Zhang, L. Wang, M. Zhu, C. Zhang and J. Li: Mix-Teaching: A Simple, Unified and Effective Semi-Supervised Learning Framework for Monocular 3D Object Detection. ArXiv 2022.\\
MonoAux-v2 & & 24.15 \% & 34.14 \% & 20.84 \% & 0.04 s / GPU & \\
MonoSKD & & 24.08 \% & 37.12 \% & 20.37 \% & 0.04 s / 1 core & S. Wang and J. Zheng: MonoSKD: General Distillation Framework for Monocular 3D Object Detection via Spearman Correlation Coefficient. ECAI 2023.\\
MonoSample (DID-M3D) & & 23.94 \% & 37.64 \% & 20.46 \% & 0.2 s / 1 core & J. Qiao, B. Liu, J. Yang, B. Wang, S. Xiu, X. Du and X. Nie: MonoSample: Synthetic 3D Data Augmentation Method in Monocular 3D Object Detection. IEEE Robotics and Automation Letters 2024.\\
TBD & & 23.87 \% & 37.10 \% & 20.24 \% & 0.04 s / 1 core & \\
PS-fld & & 23.76 \% & 32.64 \% & 20.64 \% & 0.25 s / 1 core & Y. Chen, H. Dai and Y. Ding: Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.\\
MSFENet & & 23.65 \% & 36.81 \% & 20.06 \% & 0.1 s / 1 core & \\
SHUD & & 23.63 \% & 36.39 \% & 20.01 \% & 0.04 s / 1 core & \\
ADD & & 23.58 \% & 35.20 \% & 20.08 \% & 0.1 s / 1 core & Z. Wu, Y. Wu, J. Pu, X. Li and X. Wang: Attention-based Depth Distillation with 3D-Aware Positional Encoding for Monocular 3D Object Detection. AAAI 2023.\\
MonoNeRD & & 23.46 \% & 31.13 \% & 20.97 \% & na s / 1 core & J. Xu, L. Peng, H. Cheng, H. Li, W. Qian, K. Li, W. Wang and D. Cai: MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection. ICCV 2023.\\
MonoDDE & & 23.46 \% & 33.58 \% & 20.37 \% & 0.04 s / 1 core & Z. Li, Z. Qu, Y. Zhou, J. Liu, H. Wang and L. Jiang: Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection. CVPR 2022.\\
DD3D & & 23.41 \% & 32.35 \% & 20.42 \% & n/a s / 1 core & D. Park, R. Ambrus, V. Guizilini, J. Li and A. Gaidon: Is Pseudo-Lidar needed for Monocular 3D Object detection? IEEE/CVF International Conference on Computer Vision (ICCV).\\
MonoSGC & & 23.27 \% & 35.78 \% & 19.92 \% & 0.04 s / 1 core & \\
FDGNet & & 23.27 \% & 36.25 \% & 19.56 \% & 0.1 s / 1 core & \\
MonoUNI & & 23.05 \% & 33.28 \% & 19.39 \% & 0.04 s / 1 core & J. Jia, Z. Li and Y. Shi: MonoUNI: A Unified Vehicle and Infrastructure-side Monocular 3D Object Detection Network with Sufficient Depth Clues. Thirty-seventh Conference on Neural Information Processing Systems 2023.\\
zqd & & 23.00 \% & 35.01 \% & 20.99 \% & 0.2 s / 1 core & \\
ZQD & & 22.86 \% & 38.51 \% & 19.26 \% & 0.1 s / 1 core & \\
MonoCD & & 22.81 \% & 33.41 \% & 19.57 \% & n/a s / 1 core & L. Yan, P. Yan, S. Xiong, X. Xiang and Y. Tan: MonoCD: Monocular 3D Object Detection with Complementary Depths. CVPR 2024.\\
MonoFRD & & 22.77 \% & 29.65 \% & 20.41 \% & 0.01 s / 1 core & \\
DID-M3D & & 22.76 \% & 32.95 \% & 19.83 \% & 0.04 s / 1 core & L. Peng, X. Wu, Z. Yang, H. Liu and D. Cai: DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection. ECCV 2022.\\
OPA-3D & & 22.53 \% & 33.54 \% & 19.22 \% & 0.04 s / 1 core & Y. Su, Y. Di, G. Zhai, F. Manhardt, J. Rambach, B. Busam, D. Stricker and F. Tombari: OPA-3D: Occlusion-Aware Pixel-Wise Aggregation for Monocular 3D Object Detection. IEEE Robotics and Automation Letters 2023.\\
DCD & & 21.50 \% & 32.55 \% & 18.25 \% & 0.03 s / 1 core & Y. Li, Y. Chen, J. He and Z. Zhang: Densely Constrained Depth Estimator for Monocular 3D Object Detection. European Conference on Computer Vision 2022.\\
MonoDETR & & 21.45 \% & 32.20 \% & 18.68 \% & 0.04 s / 1 core & R. Zhang, H. Qiu, T. Wang, X. Xu, Z. Guo, Y. Qiao, P. Gao and H. Li: MonoDETR: Depth-aware Transformer for Monocular 3D Object Detection. arXiv preprint arXiv:2203.13310 2022.\\
SGM3D & & 21.37 \% & 31.49 \% & 18.43 \% & 0.03 s / 1 core & Z. Zhou, L. Du, X. Ye, Z. Zou, X. Tan, L. Zhang, X. Xue and J. Feng: SGM3D: Stereo Guided Monocular 3D Object Detection. RA-L 2022.\\
Cube R-CNN & & 21.20 \% & 31.70 \% & 18.43 \% & 0.05 s / GPU & G. Brazil, A. Kumar, J. Straub, N. Ravi, J. Johnson and G. Gkioxari: Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild. CVPR 2023.\\
GUPNet & & 21.19 \% & 30.29 \% & 18.20 \% & NA s / 1 core & Y. Lu, X. Ma, L. Yang, T. Zhang, Y. Liu, Q. Chu, J. Yan and W. Ouyang: Geometry Uncertainty Projection Network for Monocular 3D Object Detection. arXiv preprint arXiv:2107.13774 2021.\\
MonoSIM\_v2 & & 21.19 \% & 30.36 \% & 18.45 \% & 0.03 s / 1 core & \\
HomoLoss(monoflex) & & 20.68 \% & 29.60 \% & 17.81 \% & 0.04 s / 1 core & J. Gu, B. Wu, L. Fan, J. Huang, S. Cao, Z. Xiang and X. Hua: Homography Loss for Monocular 3D Object Detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.\\
DEVIANT & & 20.44 \% & 29.65 \% & 17.43 \% & 0.04 s / & A. Kumar, G. Brazil, E. Corona, A. Parchami and X. Liu: DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection. European Conference on Computer Vision (ECCV) 2022.\\
MonoDTR & & 20.38 \% & 28.59 \% & 17.14 \% & 0.04 s / 1 core & K. Huang, T. Wu, H. Su and W. Hsu: MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer. CVPR 2022.\\
MDSNet & & 20.14 \% & 32.81 \% & 15.77 \% & 0.05 s / 1 core & Z. Xie, Y. Song, J. Wu, Z. Li, C. Song and Z. Xu: MDS-Net: Multi-Scale Depth Stratification 3D Object Detection from Monocular Images. Sensors 2022.\\
MonoSIM & & 20.09 \% & 28.68 \% & 18.28 \% & 0.16 s / 1 core & \\
AutoShape & & 20.08 \% & 30.66 \% & 15.95 \% & 0.04 s / 1 core & Z. Liu, D. Zhou, F. Lu, J. Fang and L. Zhang: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection. Proceedings of the IEEE/CVF International Conference on Computer Vision 2021.\\
MonoFlex & & 19.75 \% & 28.23 \% & 16.89 \% & 0.03 s / GPU & Y. Zhang, J. Lu and J. Zhou: Objects are Different: Flexible Monocular 3D Object Detection. CVPR 2021.\\
MonoEF & & 19.70 \% & 29.03 \% & 17.26 \% & 0.03 s / 1 core & Y. Zhou, Y. He, H. Zhu, C. Wang, H. Li and Q. Jiang: Monocular 3D Object Detection: An Extrinsic Parameter Free Approach. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021.\\
MonOAPC & & 19.67 \% & 28.91 \% & 16.99 \% & 0.035 s / 1 core & H. Yao, J. Chen, Z. Wang, X. Wang, P. Han, X. Chai and Y. Qiu: Occlusion-Aware Plane-Constraints for Monocular 3D Object Detection. IEEE Transactions on Intelligent Transportation Systems 2023.\\
MonoDSSMs-M & & 19.59 \% & 28.29 \% & 16.34 \% & 0.02 s / 1 core & \\
MonoDSSMs-A & & 19.54 \% & 28.84 \% & 16.30 \% & 0.02 s / 1 core & \\
HomoLoss(imvoxelnet) & & 19.25 \% & 29.18 \% & 16.21 \% & 0.20 s / 1 core & J. Gu, B. Wu, L. Fan, J. Huang, S. Cao, Z. Xiang and X. Hua: Homography Loss for Monocular 3D Object Detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.\\
DFR-Net & & 19.17 \% & 28.17 \% & 14.84 \% & 0.18 s / & Z. Zou, X. Ye, L. Du, X. Cheng, X. Tan, L. Zhang, J. Feng, X. Xue and E. Ding: The devil is in the task: Exploiting reciprocal appearance-localization features for monocular 3d object detection. ICCV 2021.\\
PS-SVDM & & 19.07 \% & 28.52 \% & 16.30 \% & 1 s / 1 core & Y. Shi: SVDM: Single-View Diffusion Model for Pseudo-Stereo 3D Object Detection. arXiv preprint arXiv:2307.02270 2023.\\
DLE & & 19.05 \% & 31.09 \% & 14.13 \% & 0.06 s / & C. Liu, S. Gu, L. Gool and R. Timofte: Deep Line Encoding for Monocular 3D Object Detection and Depth Prediction. Proceedings of the British Machine Vision Conference (BMVC) 2021.\\
PCT & & 19.03 \% & 29.65 \% & 15.92 \% & 0.045 s / 1 core & L. Wang, L. Zhang, Y. Zhu, Z. Zhang, T. He, M. Li and X. Xue: Progressive Coordinate Transforms for Monocular 3D Object Detection. NeurIPS 2021.\\
CaDDN & & 18.91 \% & 27.94 \% & 17.19 \% & 0.63 s / GPU & C. Reading, A. Harakeh, J. Chae and S. Waslander: Categorical Depth Distribution Network for Monocular 3D Object Detection. CVPR 2021.\\
monodle & & 18.89 \% & 24.79 \% & 16.00 \% & 0.04 s / GPU & X. Ma, Y. Zhang, D. Xu, D. Zhou, S. Yi, H. Li and W. Ouyang: Delving into Localization Errors for Monocular 3D Object Detection. CVPR 2021.\\
Neighbor-Vote & & 18.65 \% & 27.39 \% & 16.54 \% & 0.1 s / GPU & X. Chu, J. Deng, Y. Li, Z. Yuan, Y. Zhang, J. Ji and Y. Zhang: Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting. ACM MM 2021.\\
MonoRCNN++ & & 18.62 \% & 27.20 \% & 15.69 \% & 0.07 s / GPU & X. Shi, Z. Chen and T. Kim: Multivariate Probabilistic Monocular 3D Object Detection. WACV 2023.\\
GrooMeD-NMS & & 18.27 \% & 26.19 \% & 14.05 \% & 0.12 s / 1 core & A. Kumar, G. Brazil and X. Liu: GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection. CVPR 2021.\\
MonoRCNN & & 18.11 \% & 25.48 \% & 14.10 \% & 0.07 s / GPU & X. Shi, Q. Ye, X. Chen, C. Chen, Z. Chen and T. Kim: Geometry-based Distance Decomposition for Monocular 3D Object Detection. ICCV 2021.\\
Ground-Aware & & 17.98 \% & 29.81 \% & 13.08 \% & 0.05 s / 1 core & Y. Liu, Y. Yuan and M. Liu: Ground-aware Monocular 3D Object Detection for Autonomous Driving. IEEE Robotics and Automation Letters 2021.\\
Aug3D-RPN & & 17.89 \% & 26.00 \% & 14.18 \% & 0.08 s / 1 core & C. He, J. Huang, X. Hua and L. Zhang: Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth. 2021.\\
DDMP-3D & & 17.89 \% & 28.08 \% & 13.44 \% & 0.18 s / 1 core & L. Wang, L. Du, X. Ye, Y. Fu, G. Guo, X. Xue, J. Feng and L. Zhang: Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection. CVPR 2020.\\
IAFA & & 17.88 \% & 25.88 \% & 15.35 \% & 0.04 s / 1 core & D. Zhou, X. Song, Y. Dai, J. Yin, F. Lu, M. Liao, J. Fang and L. Zhang: IAFA: Instance-Aware Feature Aggregation for 3D Object Detection from a Single Image. Proceedings of the Asian Conference on Computer Vision 2020.\\
mdab & & 17.74 \% & 26.42 \% & 15.71 \% & 22 s / 1 core & \\
FMF-occlusion-net & & 17.60 \% & 27.39 \% & 13.25 \% & 0.16 s / 1 core & H. Liu, H. Liu, Y. Wang, F. Sun and W. Huang: Fine-grained Multi-level Fusion for Anti-occlusion Monocular 3D Object Detection. IEEE Transactions on Image Processing 2022.\\
RefinedMPL & & 17.60 \% & 28.08 \% & 13.95 \% & 0.15 s / GPU & J. Vianney, S. Aich and B. Liu: RefinedMPL: Refined Monocular PseudoLiDAR for 3D Object Detection in Autonomous Driving. arXiv preprint arXiv:1911.09712 2019.\\
Kinematic3D & & 17.52 \% & 26.69 \% & 13.10 \% & 0.12 s / 1 core & G. Brazil, G. Pons-Moll, X. Liu and B. Schiele: Kinematic 3D Object Detection in Monocular Video. ECCV 2020.\\
MonoAuxNorm & & 17.38 \% & 23.43 \% & 14.74 \% & 0.02 s / GPU & \\
MonoRUn & & 17.34 \% & 27.94 \% & 15.24 \% & 0.07 s / GPU & H. Chen, Y. Huang, W. Tian, Z. Gao and L. Xiong: MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2021.\\
AM3D & & 17.32 \% & 25.03 \% & 14.91 \% & 0.4 s / GPU & X. Ma, Z. Wang, H. Li, P. Zhang, W. Ouyang and X. Fan: Accurate Monocular Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving. Proceedings of the IEEE International Conference on Computer Vision (ICCV) 2019.\\
YoloMono3D & & 17.15 \% & 26.79 \% & 12.56 \% & 0.05 s / GPU & Y. Liu, L. Wang and L. Ming: YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection. 2021 International Conference on Robotics and Automation (ICRA) 2021.\\
CMAN & & 17.04 \% & 25.89 \% & 12.88 \% & 0.15 s / 1 core & C. Yuanzhouhan Cao: CMAN: Leaning Global Structure Correlation for Monocular 3D Object Detection. IEEE Trans. Intell. Transport. Syst. 2022.\\
GAC3D & & 16.93 \% & 25.80 \% & 12.50 \% & 0.25 s / 1 core & M. Bui, D. Ngo, H. Pham and D. Nguyen: GAC3D: improving monocular 3D object detection with ground-guide model and adaptive convolution. 2021.\\
PatchNet & & 16.86 \% & 22.97 \% & 14.97 \% & 0.4 s / 1 core & X. Ma, S. Liu, Z. Xia, H. Zhang, X. Zeng and W. Ouyang: Rethinking Pseudo-LiDAR Representation. Proceedings of the European Conference on Computer Vision (ECCV) 2020.\\
SAKD-MR-Res18 & & 16.56 \% & 26.48 \% & 13.67 \% & 0.03 s / 1 core & \\
PGD-FCOS3D & & 16.51 \% & 26.89 \% & 13.49 \% & 0.03 s / 1 core & T. Wang, X. Zhu, J. Pang and D. Lin: Probabilistic and Geometric Depth: Detecting Objects in Perspective. Conference on Robot Learning (CoRL) 2021.\\
ImVoxelNet & & 16.37 \% & 25.19 \% & 13.58 \% & 0.2 s / GPU & D. Rukhovich, A. Vorontsova and A. Konushin: ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection. arXiv preprint arXiv:2106.01178 2021.\\
KM3D & & 16.20 \% & 23.44 \% & 14.47 \% & 0.03 s / 1 core & P. Li: Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training. 2020.\\
D4LCN & & 16.02 \% & 22.51 \% & 12.55 \% & 0.2 s / GPU & M. Ding, Y. Huo, H. Yi, Z. Wang, J. Shi, Z. Lu and P. Luo: Learning Depth-Guided Convolutions for Monocular 3D Object Detection. CVPR 2020.\\
MonoAIU & & 15.68 \% & 23.01 \% & 12.87 \% & 0.03 s / GPU & \\
mdab & & 15.09 \% & 23.18 \% & 13.38 \% & 22 s / 1 core & \\
MonoPair & & 14.83 \% & 19.28 \% & 12.89 \% & 0.06 s / GPU & Y. Chen, L. Tai, K. Sun and M. Li: MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020.\\
Decoupled-3D & & 14.82 \% & 23.16 \% & 11.25 \% & 0.08 s / GPU & Y. Cai, B. Li, Z. Jiao, H. Li, X. Zeng and X. Wang: Monocular 3D Object Detection with Decoupled Structured Polygon Estimation and Height-Guided Depth Estimation. AAAI 2020.\\
QD-3DT & on & 14.71 \% & 20.16 \% & 12.76 \% & 0.03 s / GPU & H. Hu, Y. Yang, T. Fischer, F. Yu, T. Darrell and M. Sun: Monocular Quasi-Dense 3D Object Tracking. ArXiv:2103.07351 2021.\\
SMOKE & & 14.49 \% & 20.83 \% & 12.75 \% & 0.03 s / GPU & Z. Liu, Z. Wu and R. Tóth: SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation. 2020.\\
RTM3D & & 14.20 \% & 19.17 \% & 11.99 \% & 0.05 s / GPU & P. Li, H. Zhao, P. Liu and F. Cao: RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving. 2020.\\
Mono3D\_PLiDAR & & 13.92 \% & 21.27 \% & 11.25 \% & 0.1 s / & X. Weng and K. Kitani: Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud. arXiv:1903.09847 2019.\\
M3D-RPN & & 13.67 \% & 21.02 \% & 10.23 \% & 0.16 s / GPU & G. Brazil and X. Liu: M3D-RPN: Monocular 3D Region Proposal Network for Object Detection. ICCV 2019.\\
CSoR & la & 13.07 \% & 18.67 \% & 10.34 \% & 3.5 s / 4 cores & L. Plotkin: PyDriver: Entwicklung eines Frameworks für räumliche Detektion und Klassifikation von Objekten in Fahrzeugumgebung. 2015.\\
mdab & & 12.67 \% & 18.79 \% & 10.41 \% & 0.02 s / 1 core & \\
MonoPSR & & 12.58 \% & 18.33 \% & 9.91 \% & 0.2 s / GPU & J. Ku*, A. Pon* and S. Waslander: Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction. CVPR 2019.\\
Plane-Constraints & & 12.06 \% & 17.31 \% & 10.05 \% & 0.05 s / 4 cores & H. Yao, J. Chen, Z. Wang, X. Wang, X. Chai, Y. Qiu and P. Han: Vertex points are not enough: Monocular 3D object detection via intra- and inter-plane constraints. Neural Networks 2023.\\
MonoCInIS & & 11.64 \% & 22.28 \% & 9.95 \% & 0.13 s / GPU & J. Heylen, M. De Wolf, B. Dawagne, M. Proesmans, L. Van Gool, W. Abbeloos, H. Abdelkawy and D. Reino: MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation. Proceedings of the IEEE/CVF International Conference on Computer Vision 2021.\\
SS3D & & 11.52 \% & 16.33 \% & 9.93 \% & 48 ms / & E. Jörgensen, C. Zach and F. Kahl: Monocular 3D Object Detection and Box Fitting Trained End-to-End Using Intersection-over-Union Loss. CoRR 2019.\\
mdab & & 11.47 \% & 17.81 \% & 9.08 \% & 0.02 s / 1 core & \\
MonoGRNet & & 11.17 \% & 18.19 \% & 8.73 \% & 0.04 s / & Z. Qin, J. Wang and Y. Lu: MonoGRNet: A Geometric Reasoning Network for 3D Object Localization. The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19) 2019.\\
MonoFENet & & 11.03 \% & 17.03 \% & 9.05 \% & 0.15 s / 1 core & W. Bao, B. Xu and Z. Chen: MonoFENet: Monocular 3D Object Detection with Feature Enhancement Networks. IEEE Transactions on Image Processing 2019.\\
MonoCInIS & & 10.96 \% & 20.42 \% & 9.23 \% & 0.14 s / GPU & J. Heylen, M. De Wolf, B. Dawagne, M. Proesmans, L. Van Gool, W. Abbeloos, H. Abdelkawy and D. Reino: MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation. Proceedings of the IEEE/CVF International Conference on Computer Vision 2021.\\
A3DODWTDA (image) & & 8.66 \% & 10.37 \% & 7.06 \% & 0.8 s / GPU & F. Gustafsson and E. Linder-Norén: Automotive 3D Object Detection Without Target Domain Annotations. 2018.\\
TLNet (Stereo) & st & 7.69 \% & 13.71 \% & 6.73 \% & 0.1 s / 1 core & Z. Qin, J. Wang and Y. Lu: Triangulation Learning Network: from Monocular to Stereo 3D Object Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.\\
Shift R-CNN (mono) & & 6.82 \% & 11.84 \% & 5.27 \% & 0.25 s / GPU & A. Naiden, V. Paunescu, G. Kim, B. Jeon and M. Leordeanu: Shift R-CNN: Deep Monocular 3D Object Detection With Closed-form Geometric Constraints. ICIP 2019.\\
SparVox3D & & 6.39 \% & 10.20 \% & 5.06 \% & 0.05 s / GPU & E. Balatkan and F. Kıraç: Improving Regression Performance on Monocular 3D Object Detection Using Bin-Mixing and Sparse Voxel Data. 2021 6th International Conference on Computer Science and Engineering (UBMK) 2021.\\
GS3D & & 6.08 \% & 8.41 \% & 4.94 \% & 2 s / 1 core & B. Li, W. Ouyang, L. Sheng, X. Zeng and X. Wang: GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.\\
MVRA + I-FRCNN+ & & 5.84 \% & 9.05 \% & 4.50 \% & 0.18 s / GPU & H. Choi, H. Kang and Y. Hyun: Multi-View Reprojection Architecture for Orientation Estimation. The IEEE International Conference on Computer Vision (ICCV) Workshops 2019.\\
WeakM3D & & 5.66 \% & 11.82 \% & 4.08 \% & 0.08 s / 1 core & L. Peng, S. Yan, B. Wu, Z. Yang, X. He and D. Cai: WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection. ICLR 2022.\\
ROI-10D & & 4.91 \% & 9.78 \% & 3.74 \% & 0.2 s / GPU & F. Manhardt, W. Kehl and A. Gaidon: ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape. Computer Vision and Pattern Recognition (CVPR) 2019.\\
3D-GCK & & 4.57 \% & 5.79 \% & 3.64 \% & 24 ms / & N. Gählert, J. Wan, N. Jourdan, J. Finkbeiner, U. Franke and J. Denzler: Single-Shot 3D Detection of Vehicles from Monocular RGB Images via Geometrically Constrained Keypoints in Real-Time. 2020 IEEE Intelligent Vehicles Symposium (IV) 2020.\\
FQNet & & 3.23 \% & 5.40 \% & 2.46 \% & 0.5 s / 1 core & L. Liu, J. Lu, C. Xu, Q. Tian and J. Zhou: Deep Fitting Degree Scoring Network for Monocular 3D Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019.\\
3D-SSMFCNN & & 2.63 \% & 3.20 \% & 2.40 \% & 0.1 s / GPU & L. Novak: Vehicle Detection and Pose Estimation for Autonomous Driving. 2017.\\
VeloFCN & la & 0.14 \% & 0.02 \% & 0.21 \% & 1 s / GPU & B. Li, T. Zhang and T. Xia: Vehicle Detection from 3D Lidar Using Fully Convolutional Network. RSS 2016.\\
f3sd & & 0.01 \% & 0.00 \% & 0.01 \% & 1.67 s / 1 core & \\
multi-task CNN & & 0.00 \% & 0.00 \% & 0.00 \% & 25.1 ms / GPU & M. Oeljeklaus, F. Hoffmann and T. Bertram: A Fast Multi-Task CNN for Spatial Understanding of Traffic Scenes. IEEE Intelligent Transportation Systems Conference 2018.\\
mBoW & la & 0.00 \% & 0.00 \% & 0.00 \% & 10 s / 1 core & J. Behley, V. Steinhage and A. Cremers: Laser-based Segment Classification Using a Mixture of Bag-of-Words. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2013.
\end{tabular}