\begin{tabular}{l | c | c | c | c | c | p{7.5cm}}
{\bf Method} & {\bf Setting} & {\bf Moderate} & {\bf Easy} & {\bf Hard} & {\bf Runtime} & {\bf Reference}\\ \hline
BiProDet & & 74.32 \% & 86.74 \% & 67.45 \% & 0.1 s / GPU & \\
TED & & 74.12 \% & 88.82 \% & 66.84 \% & 0.1 s / 1 core & H. Wu, C. Wen, W. Li, R. Yang and C. Wang: Transformation-Equivariant 3D Object
Detection for Autonomous Driving. AAAI 2023.\\
CasA++ & & 73.79 \% & 87.76 \% & 66.84 \% & 0.1 s / 1 core & H. Wu, J. Deng, C. Wen, X. Li and C. Wang: CasA: A Cascade Attention Network for 3D
Object Detection from LiDAR point clouds. IEEE Transactions on Geoscience and
Remote Sensing 2022.\\
CasA & & 73.47 \% & 87.91 \% & 66.17 \% & 0.1 s / 1 core & H. Wu, J. Deng, C. Wen, X. Li and C. Wang: CasA: A Cascade Attention Network for 3D
Object Detection from LiDAR point clouds. IEEE Transactions on Geoscience and
Remote Sensing 2022.\\
3D HA Net & & 73.38 \% & 87.74 \% & 66.37 \% & 0.1 s / 1 core & Q. Xia, Y. Chen, G. Cai, G. Chen, D. Xie, J. Su and Z. Wang: 3D HANet: A Flexible 3D Heatmap Auxiliary
Network for Object Detection. IEEE Transactions on Geoscience and
Remote Sensing 2023.\\
LoGoNet & & 71.70 \% & 84.47 \% & 64.67 \% & 0.1 s / 1 core & X. Li, T. Ma, Y. Hou, B. Shi, Y. Yang, Y. Liu, X. Wu, Q. Chen, Y. Li, Y. Qiao and others: LoGoNet: Towards Accurate 3D Object
Detection with Local-to-Global Cross-Modal Fusion. CVPR 2023.\\
USVLab BSAODet & & 70.48 \% & 83.17 \% & 62.46 \% & 0.04 s / 1 core & W. Xiao, Y. Peng, C. Liu, J. Gao, Y. Wu and X. Li: Balanced Sample Assignment and Objective
for Single-Model Multi-Class 3D Object Detection. IEEE Transactions on Circuits and
Systems for Video Technology 2023.\\
HMFI & & 70.37 \% & 84.02 \% & 62.57 \% & 0.1 s / 1 core & X. Li, B. Shi, Y. Hou, X. Wu, T. Ma, Y. Li and L. He: Homogeneous Multi-modal Feature Fusion and
Interaction for 3D Object Detection. ECCV 2022.\\
IMLIDAR(base) & & 69.60 \% & 84.81 \% & 62.64 \% & 0.1 s / 1 core & \\
HPV-RCNN & & 69.56 \% & 84.24 \% & 61.42 \% & 0.08 s / 1 core & \\
EQ-PVRCNN & & 69.10 \% & 85.41 \% & 62.30 \% & 0.2 s / GPU & Z. Yang, L. Jiang, Y. Sun, B. Schiele and J. Jia: A Unified Query-based Paradigm for Point Cloud
Understanding. Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition 2022.\\
PV-PMRTNet & & 69.03 \% & 83.08 \% & 61.47 \% & 0.1 s / 1 core & \\
CAT-Det & & 68.81 \% & 83.68 \% & 61.45 \% & 0.3 s / GPU & Y. Zhang, J. Chen and D. Huang: CAT-Det: Contrastively Augmented Transformer
for Multi-modal 3D Object Detection. CVPR 2022.\\
CZY\_PPF\_Net2 & & 68.79 \% & 82.21 \% & 61.13 \% & 0.1 s / 1 core & \\
BtcDet & la & 68.68 \% & 82.81 \% & 61.81 \% & 0.09 s / GPU & Q. Xu, Y. Zhong and U. Neumann: Behind the Curtain: Learning Occluded
Shapes for 3D Object Detection. Proceedings of the AAAI Conference on
Artificial Intelligence 2022.\\
SPT & & 68.60 \% & 84.90 \% & 61.69 \% & 0.1 s / GPU & \\
Anonymous & & 68.57 \% & 82.84 \% & 60.51 \% & 0.1 s / 1 core & \\
DSA-PV-RCNN & la & 68.54 \% & 82.19 \% & 61.33 \% & 0.08 s / 1 core & P. Bhattacharyya, C. Huang and K. Czarnecki: SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection. 2021.\\
PA-Det3D & & 68.48 \% & 82.91 \% & 61.93 \% & 0.06 s / 1 core & \\
ACF-Net & & 68.37 \% & 84.29 \% & 62.08 \% & n/a s / 1 core & \\
PIPC-3Ddet & & 68.28 \% & 84.15 \% & 60.69 \% & 0.05 s / 1 core & \\
CZY\_PPF\_Net & & 68.23 \% & 83.46 \% & 62.05 \% & 0.1 s / 1 core & \\
Semantical PVRCNN & & 68.21 \% & 83.46 \% & 61.17 \% & 0.07 s / 1 core & \\
Rnet & & 68.05 \% & 81.20 \% & 60.64 \% & 0.1 s / 1 core & \\
PA-RCNN & & 68.04 \% & 83.32 \% & 59.88 \% & 0.05 s / 1 core & \\
Under Blind Review\#2 & & 68.03 \% & 81.55 \% & 60.51 \% & 0.1 s / 1 core & \\
PDV & & 67.81 \% & 83.04 \% & 60.46 \% & 0.1 s / 1 core & J. Hu, T. Kuai and S. Waslander: Point Density-Aware Voxels for LiDAR 3D Object Detection. CVPR 2022.\\
RangeIoUDet & la & 67.77 \% & 83.12 \% & 60.26 \% & 0.02 s / GPU & Z. Liang, Z. Zhang, M. Zhang, X. Zhao and S. Pu: RangeIoUDet: Range Image Based Real-Time 3D Object Detector Optimized by Intersection Over Union. CVPR 2021.\\
SGDA3D & & 67.55 \% & 82.10 \% & 60.70 \% & 0.07 s / 1 core & \\
Anonymous & & 67.53 \% & 83.37 \% & 60.58 \% & 0.09 s / 1 core & \\
3ONet & & 67.53 \% & 83.37 \% & 60.58 \% & 0.09 s / 1 core & \\
DCAN-Second & & 67.50 \% & 84.90 \% & 60.78 \% & 0.05 s / 1 core & \\
MPFusion & & 67.17 \% & 83.96 \% & 60.37 \% & 0.1 s / 1 core & \\
SPG\_mini & la & 66.96 \% & 80.21 \% & 60.50 \% & 0.09 s / GPU & Q. Xu, Y. Zhou, W. Wang, C. Qi and D. Anguelov: SPG: Unsupervised Domain Adaptation for 3D Object Detection via Semantic Point Generation. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.\\
POP-RCNN & & 66.96 \% & 84.01 \% & 60.23 \% & 0.1 s / 1 core & \\
FEMV-RCNN & & 66.95 \% & 83.31 \% & 60.40 \% & 0.03 s / 1 core & \\
M3DeTR & & 66.74 \% & 83.83 \% & 59.03 \% & n/a s / GPU & T. Guan, J. Wang, S. Lan, R. Chandra, Z. Wu, L. Davis and D. Manocha: M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers. 2021.\\
ACDet & & 66.61 \% & 83.80 \% & 59.99 \% & 0.05 s / 1 core & J. Xu, G. Wang, X. Zhang and G. Wan: ACDet: Attentive Cross-view Fusion
for LiDAR-based 3D Object Detection. 3DV 2022.\\
Anonymous & la & 66.46 \% & 82.62 \% & 60.09 \% & 0.05 s / GPU & \\
F3D & & 66.45 \% & 83.49 \% & 59.60 \% & 0.01 s / 1 core & \\
LGSL & & 66.36 \% & 77.80 \% & 60.45 \% & 0.1 s / GPU & \\
IA-SSD (single) & & 66.25 \% & 82.36 \% & 59.70 \% & 0.013 s / 1 core & Y. Zhang, Q. Hu, G. Xu, Y. Ma, J. Wan and Y. Guo: Not All Points Are Equal: Learning Highly
Efficient Point-based Detectors for 3D LiDAR Point
Clouds. CVPR 2022.\\
Anonymous & & 66.14 \% & 82.06 \% & 58.06 \% & 0.1 s / 1 core & \\
DTSSD & & 66.12 \% & 80.61 \% & 60.10 \% & 0.1 s / 1 core & \\
DTSSD & & 66.12 \% & 80.96 \% & 59.50 \% & 0.1 s / 1 core & \\
HybridPillars & & 66.05 \% & 81.42 \% & 59.59 \% & 0.05 s / 1 core & \\
CZY & & 65.97 \% & 82.86 \% & 58.33 \% & 0.1 s / 1 core & \\
HotSpotNet & & 65.95 \% & 82.59 \% & 59.00 \% & 0.04 s / 1 core & Q. Chen, L. Sun, Z. Wang, K. Jia and A. Yuille: Object as Hotspots. Proceedings of the European Conference on Computer Vision (ECCV) 2020.\\
DFAF3D & & 65.86 \% & 82.09 \% & 59.02 \% & 0.05 s / 1 core & Q. Tang, X. Bai, J. Guo, B. Pan and W. Jiang: DFAF3D: A dual-feature-aware anchor-free
single-stage 3D detector for point clouds. Image and Vision Computing 2023.\\
CZY\_3917 & & 65.64 \% & 80.45 \% & 58.48 \% & 0.1 s / 1 core & \\
MMF & & 65.39 \% & 80.82 \% & 59.10 \% & 1 s / 1 core & \\
Fast-CLOCs & & 65.31 \% & 82.83 \% & 57.43 \% & 0.1 s / GPU & S. Pang, D. Morris and H. Radha: Fast-CLOCs: Fast Camera-LiDAR
Object Candidates Fusion for 3D Object Detection. Proceedings of the IEEE/CVF
Winter Conference on Applications of Computer
Vision (WACV) 2022.\\
ACCF & & 65.25 \% & 81.16 \% & 58.98 \% & 0.02 s / 1 core & \\
RPF3D & & 65.25 \% & 79.93 \% & 58.52 \% & 0.1 s / 1 core & \\
VGA-RCNN & & 65.19 \% & 79.70 \% & 58.52 \% & 0.07 s / 1 core & \\
IKT3D & la & 65.17 \% & 79.88 \% & 58.09 \% & 0.05 s / 1 core & \\
F-ConvNet & la & 65.07 \% & 81.98 \% & 56.54 \% & 0.47 s / GPU & Z. Wang and K. Jia: Frustum ConvNet: Sliding Frustums to
Aggregate Local Point-Wise Features for Amodal 3D
Object Detection. IROS 2019.\\
DA-Net & & 64.98 \% & 80.36 \% & 60.40 \% & 0.1 s / 1 core & \\
VPNetv2 & & 64.91 \% & 82.05 \% & 58.13 \% & 0.1 s / 1 core & \\
MVMM & & 64.81 \% & 77.82 \% & 58.79 \% & 0.04 s / GPU & \\
DGT-Det3D & & 64.80 \% & 78.06 \% & 58.08 \% & 0.02 s / 1 core & \\
IA-SSDx & & 64.72 \% & 78.51 \% & 57.11 \% & 0.01 s / 1 core & \\
casx & & 64.72 \% & 78.51 \% & 57.11 \% & 0.01 s / 1 core & \\
IPS & & 64.62 \% & 80.78 \% & 58.09 \% & TBD s / 1 core & \\
PVTr & & 64.46 \% & 81.73 \% & 57.71 \% & 0.1 s / 1 core & \\
GraphAlign & & 64.43 \% & 78.42 \% & 58.71 \% & 0.03 s / GPU & \\
GS & & 64.37 \% & 79.17 \% & 57.47 \% & TBD s / 1 core & \\
PEF & & 64.25 \% & 80.60 \% & 56.47 \% & N/A s / 1 core & \\
3DSSD & & 64.10 \% & 82.48 \% & 56.90 \% & 0.04 s / GPU & Z. Yang, Y. Sun, S. Liu and J. Jia: 3DSSD: Point-based 3D Single Stage Object
Detector. CVPR 2020.\\
VPFNet & & 64.10 \% & 77.64 \% & 58.00 \% & 0.2 s / 1 core & C. Wang, H. Chen and L. Fu: VPFNet: Voxel-Pixel Fusion Network
for Multi-class 3D Object Detection. 2021.\\
MVENet & & 63.96 \% & 76.57 \% & 57.89 \% & 0.02 s / 1 core & \\
STNet & & 63.80 \% & 80.11 \% & 56.37 \% & 0.60 s / 1 core & \\
PointPainting & la & 63.78 \% & 77.63 \% & 55.89 \% & 0.4 s / GPU & S. Vora, A. Lang, B. Helou and O. Beijbom: PointPainting: Sequential Fusion for 3D Object
Detection. CVPR 2020.\\
U\_RVRCNN\_V2\_1 & & 63.74 \% & 77.85 \% & 57.06 \% & 0.1 s / 1 core & \\
MMLab PV-RCNN & la & 63.71 \% & 78.60 \% & 57.65 \% & 0.08 s / 1 core & S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang and H. Li: PV-RCNN: Point-Voxel Feature Set
Abstraction for
3D Object Detection. CVPR 2020.\\
LightCPC & & 63.71 \% & 80.15 \% & 56.66 \% & 0.02 s / 1 core & \\
WGVRF & & 63.58 \% & 78.81 \% & 57.27 \% & 0.1 s / 1 core & \\
NV-RCNN & & 63.57 \% & 80.12 \% & 56.18 \% & 0.1 s / 1 core & \\
MMLab-PartA$^2$ & la & 63.52 \% & 79.17 \% & 56.93 \% & 0.08 s / GPU & S. Shi, Z. Wang, J. Shi, X. Wang and H. Li: From Points to Parts: 3D Object Detection from
Point Cloud with Part-aware and Part-aggregation
Network. IEEE Transactions on Pattern Analysis and
Machine Intelligence 2020.\\
Point-GNN & la & 63.48 \% & 78.60 \% & 57.08 \% & 0.6 s / GPU & W. Shi and R. Rajkumar: Point-GNN: Graph Neural Network for 3D
Object Detection in a Point Cloud. CVPR 2020.\\
MGAF-3DSSD & & 63.43 \% & 80.64 \% & 55.15 \% & 0.1 s / 1 core & J. Li, H. Dai, L. Shao and Y. Ding: Anchor-free 3D Single Stage
Detector with Mask-Guided Attention for Point
Cloud. MM '21: The 29th ACM
International Conference on Multimedia (ACM MM) 2021.\\
FromVoxelToPoint & & 63.41 \% & 81.49 \% & 56.40 \% & 0.1 s / 1 core & J. Li, H. Dai, L. Shao and Y. Ding: From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder. MM '21: The 29th ACM International Conference on Multimedia (ACM MM) 2021.\\
B2PE & & 63.18 \% & 76.71 \% & 56.16 \% & 0.02 s / 1 core & \\
RealSynthesis-SECOND & & 63.16 \% & 81.44 \% & 56.24 \% & 0.05 s / 1 core & \\
P2V-RCNN & & 63.13 \% & 78.62 \% & 56.81 \% & 0.1 s / 2 cores & J. Li, S. Luo, Z. Zhu, H. Dai, A. Krylov, Y. Ding and L. Shao: P2V-RCNN: Point to Voxel Feature
Learning for 3D Object Detection from Point
Clouds. IEEE Access 2021.\\
PSA-SSD & & 62.87 \% & 76.36 \% & 56.99 \% & 0.01 s / 1 core & \\
H$^2$3D R-CNN & & 62.74 \% & 78.67 \% & 55.78 \% & 0.03 s / 1 core & J. Deng, W. Zhou, Y. Zhang and H. Li: From Multi-View to Hollow-3D: Hallucinated
Hollow-3D R-CNN for 3D Object Detection. IEEE Transactions on Circuits and Systems
for Video Technology 2021.\\
U\_PVRCNN\_V2 & & 62.50 \% & 75.08 \% & 55.32 \% & 0.1 s / 1 core & \\
TTT\_SSD & & 62.42 \% & 76.07 \% & 56.39 \% & TBD s / 1 core & \\
VPNet & & 62.38 \% & 77.56 \% & 55.92 \% & 0.1 s / 1 core & \\
SVGA-Net & & 62.28 \% & 78.58 \% & 54.88 \% & 0.03s / 1 core & Q. He, Z. Wang, H. Zeng, Y. Zeng and Y. Liu: SVGA-Net: Sparse Voxel-Graph Attention
Network for 3D Object Detection from Point
Clouds. AAAI 2022.\\
AGS-SSD & la & 62.15 \% & 77.40 \% & 56.14 \% & 0.04 s / 1 core & \\
SRDL & & 62.02 \% & 77.35 \% & 55.52 \% & 0.05 s / 1 core & \\
Faraway-Frustum & la & 62.00 \% & 77.36 \% & 55.40 \% & 0.1 s / GPU & H. Zhang, D. Yang, E. Yurtsever, K. Redmill and U. Ozguner: Faraway-frustum: Dealing with lidar sparsity for 3D object detection using fusion. 2021 IEEE International Intelligent Transportation Systems Conference (ITSC) 2021.\\
DVFENet & & 62.00 \% & 78.73 \% & 55.18 \% & 0.05 s / 1 core & Y. He, G. Xia, Y. Luo, L. Su, Z. Zhang, W. Li and P. Wang: DVFENet: Dual-branch Voxel Feature
Extraction Network for 3D Object Detection. Neurocomputing 2021.\\
PVRCNN\_8369 & & 61.99 \% & 77.33 \% & 55.51 \% & 0.1 s / 1 core & \\
IA-SSD (multi) & & 61.94 \% & 78.35 \% & 55.70 \% & 0.014 s / 1 core & Y. Zhang, Q. Hu, G. Xu, Y. Ma, J. Wan and Y. Guo: Not All Points Are Equal: Learning Highly
Efficient Point-based Detectors for 3D LiDAR Point
Clouds. CVPR 2022.\\
PSA-Det3D & & 61.79 \% & 75.82 \% & 55.12 \% & 0.1 s / GPU & \\
S-AT GCN & & 61.70 \% & 75.24 \% & 55.32 \% & 0.02 s / GPU & L. Wang, C. Wang, X. Zhang, T. Lan and J. Li: S-AT GCN: Spatial-Attention
Graph Convolution Network based Feature
Enhancement for 3D Object
Detection. CoRR 2021.\\
SIF & & 61.61 \% & 77.13 \% & 55.11 \% & 0.1 s / 1 core & P. An: SIF. Submitted to CVIU 2021.\\
ATT\_SSD & & 61.61 \% & 77.19 \% & 55.62 \% & 0.01 s / 1 core & \\
STD & & 61.59 \% & 78.69 \% & 55.30 \% & 0.08 s / GPU & Z. Yang, Y. Sun, S. Liu, X. Shen and J. Jia: STD: Sparse-to-Dense 3D Object Detector for
Point Cloud. ICCV 2019.\\
HybridPillars (SSD) & & 61.49 \% & 76.32 \% & 55.77 \% & 0.02 s / 1 core & \\
GEO\_LOC & & 61.37 \% & 75.64 \% & 55.22 \% & TBD s / 1 core & \\
GS-FPS-LT & & 61.15 \% & 76.16 \% & 54.65 \% & TBD s / 1 core & \\
SWA & & 61.12 \% & 76.47 \% & 55.51 \% & 0.18 s / 1 core & \\
GS-FPS & & 60.44 \% & 77.36 \% & 54.49 \% & TBD s / 1 core & \\
BASA & & 60.43 \% & 76.46 \% & 54.47 \% & 1s / 1 core & \\
AB3DMOT & la on & 60.30 \% & 75.42 \% & 53.81 \% & 0.0047s / 1 core & X. Weng and K. Kitani: A Baseline for 3D Multi-Object
Tracking. arXiv:1907.03961 2019.\\
OA-TSSD & & 60.03 \% & 76.09 \% & 53.43 \% & 20 s / 8 cores & \\
fuf & & 60.02 \% & 79.07 \% & 53.61 \% & 10 s / 1 core & \\
EPNet++ & & 59.71 \% & 76.15 \% & 53.67 \% & 0.1 s / GPU & Z. Liu, T. Huang, B. Li, X. Chen, X. Wang and X. Bai: EPNet++: Cascade Bi-Directional Fusion for
Multi-Modal 3D Object Detection. IEEE Transactions on
Pattern Analysis and Machine Intelligence 2022.\\
XView & & 59.55 \% & 77.24 \% & 53.47 \% & 0.1 s / 1 core & L. Xie, G. Xu, D. Cai and X. He: X-view: Non-egocentric Multi-View 3D
Object Detector. 2021.\\
TANet & & 59.44 \% & 75.70 \% & 52.53 \% & 0.035s / GPU & Z. Liu, X. Zhao, T. Huang, R. Hu, Y. Zhou and X. Bai: TANet: Robust 3D Object Detection from
Point Clouds with Triple Attention. AAAI 2020.\\
DTE3D & & 59.12 \% & 76.99 \% & 52.97 \% & 0.15s / 1 core & \\
EOTL & & 58.96 \% & 75.20 \% & 50.41 \% & TBD s / 1 core & \\
MMLab-PointRCNN & la & 58.82 \% & 74.96 \% & 52.53 \% & 0.1 s / GPU & S. Shi, X. Wang and H. Li: PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019.\\
PointPillars & la & 58.65 \% & 77.10 \% & 51.92 \% & 16 ms / & A. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang and O. Beijbom: PointPillars: Fast Encoders for Object Detection from
Point Clouds. CVPR 2019.\\
ARPNET & & 58.20 \% & 74.21 \% & 52.13 \% & 0.08 s / GPU & Y. Ye, C. Zhang and X. Hao: ARPNET: attention region proposal network
for 3D object detection. Science China Information Sciences 2019.\\
ZMMPP & & 58.03 \% & 71.72 \% & 51.89 \% & 0.1 s / 1 core & \\
U\_SECOND\_V4 & & 57.10 \% & 73.91 \% & 50.91 \% & 0.1 s / 1 core & \\
T\_PVRCNN & & 56.26 \% & 70.51 \% & 49.90 \% & 0.1 s / 1 core & \\
epBRM & la & 56.13 \% & 72.08 \% & 49.91 \% & 0.10 s / 1 core & K. Shin: Improving a Quality of 3D Object Detection
by Spatial Transformation Mechanism. arXiv preprint arXiv:1910.04853 2019.\\
F-PointNet & la & 56.12 \% & 72.27 \% & 49.01 \% & 0.17 s / GPU & C. Qi, W. Liu, C. Wu, H. Su and L. Guibas: Frustum PointNets for 3D Object Detection from RGB-D Data. arXiv preprint arXiv:1711.08488 2017.\\
SECOND\_7862 & & 55.64 \% & 71.05 \% & 49.83 \% & 1 s / 1 core & \\
CAD & st la & 55.39 \% & 70.98 \% & 48.81 \% & 0.1 s / 1 core & \\
T\_PVRCNN\_V2 & & 55.29 \% & 69.58 \% & 49.22 \% & 0.1 s / 1 core & \\
Voxel-MAE+SECOND & & 54.84 \% & 69.64 \% & 48.98 \% & 0.05 s / 1 core & \\
APDM & & 54.22 \% & 70.38 \% & 48.14 \% & 0.7 s / 1 core & \\
BirdNet+ & la & 53.84 \% & 65.67 \% & 49.06 \% & 0.11 s / & A. Barrera, J. Beltrán, C. Guindel, J. Iglesias and F. García: BirdNet+: Two-Stage 3D Object Detection
in LiDAR through a Sparsity-Invariant Bird’s Eye
View. IEEE Access 2021.\\
PointRGBNet & & 52.15 \% & 67.05 \% & 46.78 \% & 0.08 s / 4 cores & P. Xie Desheng: Real-time Detection of 3D Objects
Based on Multi-Sensor Information Fusion. Automotive Engineering 2022.\\
DMF & st & 51.33 \% & 65.51 \% & 45.05 \% & 0.2 s / 1 core & X. J. Chen and W. Xu: Disparity-Based Multiscale Fusion Network for
Transportation Detection. IEEE Transactions on Intelligent
Transportation Systems 2022.\\
PiFeNet & & 51.10 \% & 67.50 \% & 44.66 \% & 0.03 s / 1 core & D. Le, H. Shi, H. Rezatofighi and J. Cai: Accurate and Real-time 3D Pedestrian
Detection Using an Efficient Attentive Pillar
Network. IEEE Robotics and Automation Letters 2022.\\
MSAW & & 50.86 \% & 67.59 \% & 45.28 \% & 0.42 s / 2 cores & \\
SCNet & la & 50.79 \% & 67.98 \% & 45.15 \% & 0.04 s / GPU & Z. Wang, H. Fu, L. Wang, L. Xiao and B. Dai: SCNet: Subdivision Coding Network for Object Detection Based on 3D Point Cloud. IEEE Access 2019.\\
AVOD-FPN & la & 50.55 \% & 63.76 \% & 44.93 \% & 0.1 s / & J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object Detection from View Aggregation. IROS 2018.\\
HA-PillarNet & & 49.94 \% & 65.41 \% & 44.04 \% & 0.05 s / 1 core & \\
MLOD & la & 49.43 \% & 68.81 \% & 42.84 \% & 0.12 s / GPU & J. Deng and K. Czarnecki: MLOD: A multi-view 3D object detection based on robust feature fusion method. arXiv preprint arXiv:1909.04163 2019.\\
BirdNet+ (legacy) & la & 47.72 \% & 67.38 \% & 42.89 \% & 0.1 s / & A. Barrera, C. Guindel, J. Beltrán and F. García: BirdNet+: End-to-End 3D Object Detection in LiDAR Bird’s Eye View. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) 2020.\\
PFF3D & la & 46.78 \% & 63.27 \% & 41.37 \% & 0.05 s / GPU & L. Wen and K. Jo: Fast and
Accurate 3D Object Detection for Lidar-Camera-Based
Autonomous Vehicles Using One Shared Voxel-Based
Backbone. IEEE Access 2021.\\
MLAFF & & 45.95 \% & 61.63 \% & 40.87 \% & 0.39 s / 2 cores & \\
MVAF-Net(3-classes) & & 45.43 \% & 61.02 \% & 40.77 \% & 0.1 s / 1 core & \\
StereoDistill & & 44.02 \% & 63.96 \% & 39.19 \% & 0.4 s / 1 core & Z. Liu, X. Ye, X. Tan, E. Ding, Y. Zhou and X. Bai: StereoDistill: Pick the Cream from LiDAR for Distilling Stereo-based 3D Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence 2023.\\
DSGN++ & st & 43.90 \% & 62.82 \% & 39.21 \% & 0.2 s / & Y. Chen, S. Huang, S. Liu, B. Yu and J. Jia: DSGN++: Exploiting Visual-Spatial Relation
for Stereo-Based 3D Detectors. IEEE Transactions on Pattern Analysis and
Machine Intelligence 2022.\\
MVAF-Net(3-classes) & & 43.74 \% & 59.00 \% & 39.42 \% & 0.1 s / 1 core & \\
AVOD & la & 42.08 \% & 57.19 \% & 38.29 \% & 0.08 s / & J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object
Detection from View Aggregation. IROS 2018.\\
SparsePool & & 37.33 \% & 52.61 \% & 33.39 \% & 0.13 s / 8 cores & Z. Wang, W. Zhan and M. Tomizuka: Fusing bird view lidar point cloud and
front view camera image for deep object
detection. arXiv preprint arXiv:1711.06703 2017.\\
MMLAB LIGA-Stereo & st & 36.86 \% & 54.44 \% & 32.06 \% & 0.4 s / 1 core & X. Guo, S. Shi, X. Wang and H. Li: LIGA-Stereo: Learning LiDAR Geometry
Aware Representations for Stereo-based 3D
Detector. Proceedings of the IEEE/CVF
International Conference on Computer Vision
(ICCV) 2021.\\
SparsePool & & 32.61 \% & 40.87 \% & 29.05 \% & 0.13 s / 8 cores & Z. Wang, W. Zhan and M. Tomizuka: Fusing bird view lidar point cloud and
front view camera image for deep object
detection. arXiv preprint arXiv:1711.06703 2017.\\
CG-Stereo & st & 30.89 \% & 47.40 \% & 27.23 \% & 0.57 s / & C. Li, J. Ku and S. Waslander: Confidence Guided Stereo 3D Object
Detection with
Split Depth Estimation. IROS 2020.\\
BirdNet & la & 30.25 \% & 43.98 \% & 27.21 \% & 0.11 s / & J. Beltrán, C. Guindel, F. Moreno, D. Cruzado, F. García and A. Escalera: BirdNet: A 3D Object Detection Framework
from LiDAR Information. 2018 21st International Conference on
Intelligent Transportation Systems (ITSC) 2018.\\
PS++ & & 28.66 \% & 44.45 \% & 24.96 \% & 0.4 s / 1 core & \\
Disp R-CNN (velo) & st & 24.40 \% & 40.05 \% & 21.12 \% & 0.387 s / GPU & J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao: Disp R-CNN: Stereo 3D Object Detection via
Shape Prior Guided Instance Disparity Estimation. CVPR 2020.\\
Disp R-CNN & st & 24.40 \% & 40.04 \% & 21.12 \% & 0.387 s / GPU & J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao: Disp R-CNN: Stereo 3D Object Detection
via Shape Prior Guided Instance Disparity
Estimation. CVPR 2020.\\
Complexer-YOLO & la & 18.53 \% & 24.27 \% & 17.31 \% & 0.06 s / GPU & M. Simon, K. Amende, A. Kraus, J. Honer, T. Samann, H. Kaulbersch, S. Milz and H. Michael Gross: Complexer-YOLO: Real-Time 3D Object
Detection and Tracking on Semantic Point
Clouds. The IEEE Conference on Computer
Vision and Pattern Recognition (CVPR)
Workshops 2019.\\
DSGN & st & 18.17 \% & 27.76 \% & 16.21 \% & 0.67 s / & Y. Chen, S. Liu, X. Shen and J. Jia: DSGN: Deep Stereo Geometry Network for 3D
Object Detection. CVPR 2020.\\
OC Stereo & st & 16.63 \% & 29.40 \% & 14.72 \% & 0.35 s / 1 core & A. Pon, J. Ku, C. Li and S. Waslander: Object-Centric Stereo Matching for 3D
Object Detection. ICRA 2020.\\
RT3D-GMP & st & 12.99 \% & 18.31 \% & 10.63 \% & 0.06 s / GPU & H. Königshof and C. Stiller: Learning-Based Shape Estimation with Grid Map Patches for Realtime 3D Object Detection for Automated Driving. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) 2020.\\
ESGN & st & 7.69 \% & 13.84 \% & 6.75 \% & 0.06 s / GPU & A. Gao, Y. Pang, J. Nie, Z. Shao, J. Cao, Y. Guo and X. Li: ESGN: Efficient Stereo Geometry Network
for Fast 3D Object Detection. IEEE Transactions on Circuits and
Systems for Video Technology 2022.\\
CMKD & & 6.67 \% & 12.52 \% & 6.34 \% & 0.1 s / 1 core & Y. Hong, H. Dai and Y. Ding: Cross-Modality Knowledge
Distillation Network for Monocular 3D Object
Detection. ECCV 2022.\\
PS-fld & & 6.18 \% & 11.22 \% & 5.21 \% & 0.25 s / 1 core & Y. Chen, H. Dai and Y. Ding: Pseudo-Stereo for Monocular 3D Object
Detection in Autonomous Driving. Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern
Recognition (CVPR) 2022.\\
Anonymous & & 5.74 \% & 9.52 \% & 4.66 \% & 0.1 s / 1 core & \\
DD3Dv2 & & 5.68 \% & 8.79 \% & 4.75 \% & 0.1 s / 1 core & \\
BSM3D & & 5.61 \% & 9.45 \% & 4.81 \% & 0.03 s / 1 core & \\
MonoLiG & & 5.24 \% & 8.14 \% & 4.45 \% & 0.03 s / 1 core & \\
BAIR & & 4.97 \% & 8.17 \% & 4.62 \% & 0.04 s / 1 core & \\
Mix-Teaching & & 4.91 \% & 8.04 \% & 4.15 \% & 30 s / 1 core & L. Yang, X. Zhang, L. Wang, M. Zhu, C. Zhang and J. Li: Mix-Teaching: A Simple, Unified and
Effective Semi-Supervised Learning Framework for
Monocular 3D Object Detection. ArXiv 2022.\\
DD3D & & 4.79 \% & 7.52 \% & 4.22 \% & n/a s / 1 core & D. Park, R. Ambrus, V. Guizilini, J. Li and A. Gaidon: Is Pseudo-Lidar needed for Monocular 3D Object Detection? IEEE/CVF International Conference on Computer Vision (ICCV) 2021.\\
MonoPSR & & 4.74 \% & 8.37 \% & 3.68 \% & 0.2 s / GPU & J. Ku*, A. Pon* and S. Waslander: Monocular 3D Object Detection Leveraging
Accurate Proposals and Shape Reconstruction. CVPR 2019.\\
DD3D-dequity & & 4.61 \% & 7.32 \% & 4.10 \% & 0.1 s / 1 core & \\
TopNet-UncEst & la & 4.54 \% & 7.13 \% & 3.81 \% & 0.09 s / & S. Wirges, M. Braun, M. Lauer and C. Stiller: Capturing
Object Detection Uncertainty in Multi-Layer Grid
Maps. 2019.\\
LPCG-Monoflex & & 4.38 \% & 6.98 \% & 3.56 \% & 0.03 s / 1 core & L. Peng, F. Liu, Z. Yu, S. Yan, D. Deng, Z. Yang, H. Liu and D. Cai: Lidar Point Cloud Guided Monocular 3D
Object Detection. ECCV 2022.\\
MonoLSS & & 4.34 \% & 7.23 \% & 3.92 \% & 0.04 s / 1 core & \\
MonoUNI & & 4.28 \% & 7.34 \% & 3.78 \% & 0.04 s / 1 core & \\
3DSeMoDLE & & 4.24 \% & 7.04 \% & 3.56 \% & 0.1 s / 1 core & \\
Plane-Constraints & & 4.22 \% & 7.72 \% & 3.36 \% & 0.05 s / 4 cores & H. Yao, J. Chen, Z. Wang, X. Wang, X. Chai, Y. Qiu and P. Han: Vertex points are not enough: Monocular
3D object detection via intra- and inter-plane
constraints. Neural Networks 2023.\\
MonoAD & & 4.22 \% & 6.59 \% & 3.52 \% & 0.03 s / GPU & \\
MM3D & & 3.99 \% & 7.46 \% & 3.22 \% & NA s / 1 core & \\
Anonymous & & 3.94 \% & 6.49 \% & 3.25 \% & 0.03 s / 1 core & \\
MonoInsight & & 3.92 \% & 6.23 \% & 3.27 \% & 0.03 s / 1 core & \\
MonoDDE & & 3.78 \% & 5.94 \% & 3.33 \% & 0.04 s / 1 core & Z. Li, Z. Qu, Y. Zhou, J. Liu, H. Wang and L. Jiang: Diversity Matters: Fully Exploiting Depth
Clues for Reliable Monocular 3D Object Detection. CVPR 2022.\\
MonoATT\_V2 & & 3.68 \% & 5.74 \% & 2.94 \% & 0.03 s / 1 core & \\
DFR-Net & & 3.58 \% & 5.69 \% & 3.10 \% & 0.18 s / & Z. Zou, X. Ye, L. Du, X. Cheng, X. Tan, L. Zhang, J. Feng, X. Xue and E. Ding: The devil is in the task: Exploiting reciprocal appearance-localization features for monocular 3D object detection. ICCV 2021.\\
OccupancyM3D & & 3.56 \% & 7.37 \% & 2.84 \% & 0.11 s / 1 core & \\
BCA & & 3.54 \% & 5.89 \% & 3.34 \% & 0.17 s / GPU & \\
HomoLoss(monoflex) & & 3.50 \% & 5.48 \% & 2.99 \% & 0.04 s / 1 core & J. Gu, B. Wu, L. Fan, J. Huang, S. Cao, Z. Xiang and X. Hua: Homography Loss for Monocular 3D Object
Detection. Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern
Recognition (CVPR) 2022.\\
OPA-3D & & 3.45 \% & 5.16 \% & 2.86 \% & 0.04 s / 1 core & Y. Su, Y. Di, G. Zhai, F. Manhardt, J. Rambach, B. Busam, D. Stricker and F. Tombari: OPA-3D: Occlusion-Aware Pixel-Wise
Aggregation for Monocular 3D Object Detection. IEEE Robotics and Automation Letters 2023.\\
CaDDN & & 3.41 \% & 7.00 \% & 3.30 \% & 0.63 s / GPU & C. Reading, A. Harakeh, J. Chae and S. Waslander: Categorical Depth Distribution
Network for Monocular 3D Object Detection. CVPR 2021.\\
MonoInsight & & 3.37 \% & 5.94 \% & 3.22 \% & 0.03 s / 1 core & \\
RT3DStereo & st & 3.37 \% & 5.29 \% & 2.57 \% & 0.08 s / GPU & H. Königshof, N. Salscheider and C. Stiller: Realtime 3D Object Detection for Automated Driving Using Stereo Vision and Semantic Information. Proc. IEEE Intl. Conf. Intelligent Transportation Systems 2019.\\
MonoDTR & & 3.27 \% & 5.05 \% & 3.19 \% & 0.04 s / 1 core & K. Huang, T. Wu, H. Su and W. Hsu: MonoDTR: Monocular 3D Object Detection with
Depth-Aware Transformer. CVPR 2022.\\
GUPNet & & 3.21 \% & 5.58 \% & 2.66 \% & NA s / 1 core & Y. Lu, X. Ma, L. Yang, T. Zhang, Y. Liu, Q. Chu, J. Yan and W. Ouyang: Geometry Uncertainty Projection Network
for Monocular 3D Object Detection. arXiv preprint arXiv:2107.13774 2021.\\
DEVIANT & & 3.13 \% & 5.05 \% & 2.59 \% & 0.04 s / & A. Kumar, G. Brazil, E. Corona, A. Parchami and X. Liu: DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection. European Conference on Computer Vision (ECCV) 2022.\\
CIE & & 3.09 \% & 5.62 \% & 2.80 \% & 0.1 s / 1 core & Anonymities: Consistency of Implicit and Explicit
Features Matters for Monocular 3D Object
Detection. arXiv preprint arXiv:2207.07933 2022.\\
SparseLiDAR\_fusion & & 3.02 \% & 5.89 \% & 2.50 \% & 0.08 s / 1 core & \\
SGM3D & & 2.92 \% & 5.49 \% & 2.64 \% & 0.03 s / 1 core & Z. Zhou, L. Du, X. Ye, Z. Zou, X. Tan, L. Zhang, X. Xue and J. Feng: SGM3D: Stereo Guided Monocular 3D Object
Detection. RA-L 2022.\\
AMNet & & 2.79 \% & 4.30 \% & 2.51 \% & 0.03 s / GPU & \\
DCD & & 2.74 \% & 4.72 \% & 2.41 \% & 1 s / 1 core & \\
MDSNet & & 2.68 \% & 5.37 \% & 2.22 \% & 0.05 s / 1 core & Z. Xie, Y. Song, J. Wu, Z. Li, C. Song and Z. Xu: MDS-Net: Multi-Scale Depth Stratification
3D Object Detection from Monocular Images. Sensors 2022.\\
Cube R-CNN & & 2.67 \% & 3.65 \% & 2.28 \% & 0.05 s / GPU & G. Brazil, A. Kumar, J. Straub, N. Ravi, J. Johnson and G. Gkioxari: Omni3D: A Large Benchmark and
Model for 3D Object Detection in the Wild. CVPR 2023.\\
monodle & & 2.66 \% & 4.59 \% & 2.45 \% & 0.04 s / GPU & X. Ma, Y. Zhang, D. Xu, D. Zhou, S. Yi, H. Li and W. Ouyang: Delving into Localization Errors for
Monocular 3D Object Detection. CVPR 2021.\\
DDMP-3D & & 2.50 \% & 4.18 \% & 2.32 \% & 0.18 s / 1 core & L. Wang, L. Du, X. Ye, Y. Fu, G. Guo, X. Xue, J. Feng and L. Zhang: Depth-conditioned Dynamic Message Propagation for
Monocular 3D Object Detection. CVPR 2021.\\
MonoNeRD & & 2.48 \% & 4.73 \% & 2.16 \% & na s / 1 core & \\
Aug3D-RPN & & 2.43 \% & 4.36 \% & 2.55 \% & 0.08 s / 1 core & C. He, J. Huang, X. Hua and L. Zhang: Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth. 2021.\\
MonoXiver & & 2.41 \% & 3.62 \% & 2.04 \% & 0.03s / GPU & \\
QD-3DT & on & 2.39 \% & 4.16 \% & 1.85 \% & 0.03 s / GPU & H. Hu, Y. Yang, T. Fischer, F. Yu, T. Darrell and M. Sun: Monocular Quasi-Dense 3D Object Tracking. ArXiv:2103.07351 2021.\\
MonoFlex & & 2.35 \% & 4.17 \% & 2.04 \% & 0.03 s / GPU & Y. Zhang, J. Lu and J. Zhou: Objects are Different: Flexible Monocular 3D
Object Detection. CVPR 2021.\\
Mono3DMethod & & 2.30 \% & 3.79 \% & 2.01 \% & 0.1 s / 1 core & \\
MonoA$^2$ & & 2.28 \% & 4.39 \% & 2.31 \% & na s / 1 core & \\
MonoPair & & 2.12 \% & 3.79 \% & 1.83 \% & 0.06 s / GPU & Y. Chen, L. Tai, K. Sun and M. Li: MonoPair: Monocular 3D Object Detection
Using Pairwise Spatial Relationships. The IEEE Conference on Computer Vision
and Pattern Recognition (CVPR) 2020.\\
MonoPCNS & & 2.09 \% & 4.07 \% & 2.12 \% & 0.14 s / GPU & \\
RefinedMPL & & 1.82 \% & 3.23 \% & 1.77 \% & 0.15 s / GPU & J. Vianney, S. Aich and B. Liu: RefinedMPL: Refined Monocular PseudoLiDAR
for 3D Object Detection in Autonomous Driving. arXiv preprint arXiv:1911.09712 2019.\\
MonoRCNN++ & & 1.81 \% & 3.17 \% & 1.75 \% & 0.07 s / GPU & X. Shi, Z. Chen and T. Kim: Multivariate Probabilistic Monocular 3D
Object Detection. WACV 2023.\\
TopNet-HighRes & la & 1.67 \% & 2.49 \% & 1.88 \% & 101ms / & S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in
Occupancy Grid Maps Using Deep Convolutional
Networks. 2018 21st International Conference on
Intelligent Transportation Systems (ITSC) 2018.\\
D4LCN & & 1.67 \% & 2.45 \% & 1.36 \% & 0.2 s / GPU & M. Ding, Y. Huo, H. Yi, Z. Wang, J. Shi, Z. Lu and P. Luo: Learning Depth-Guided Convolutions for
Monocular 3D Object Detection. CVPR 2020.\\
FMF-occlusion-net & & 1.60 \% & 1.87 \% & 1.66 \% & 0.16 s / 1 core & H. Liu, H. Liu, Y. Wang, F. Sun and W. Huang: Fine-grained Multi-level Fusion for Anti-occlusion Monocular 3D Object Detection. IEEE Transactions on Image Processing 2022.\\
SS3D & & 1.45 \% & 2.80 \% & 1.35 \% & 48 ms / & E. Jörgensen, C. Zach and F. Kahl: Monocular 3D Object Detection and Box Fitting Trained
End-to-End Using
Intersection-over-Union Loss. CoRR 2019.\\
PGD-FCOS3D & & 1.38 \% & 2.81 \% & 1.20 \% & 0.03 s / 1 core & T. Wang, X. Zhu, J. Pang and D. Lin: Probabilistic and Geometric Depth:
Detecting Objects in Perspective. Conference on Robot Learning
(CoRL) 2021.\\
UNM3D & & 1.17 \% & 1.76 \% & 1.07 \% & na s / 1 core & \\
MM3DV2 & & 1.16 \% & 1.93 \% & 1.17 \% & NA s / 1 core & \\
CMAN & & 1.05 \% & 1.59 \% & 1.11 \% & 0.15 s / 1 core & C. Yuanzhouhan Cao: CMAN: Leaning Global Structure Correlation for Monocular 3D Object Detection. IEEE Trans. Intell. Transport. Syst. 2022.\\
MonoEF & & 0.92 \% & 1.80 \% & 0.71 \% & 0.03 s / 1 core & Y. Zhou, Y. He, H. Zhu, C. Wang, H. Li and Q. Jiang: Monocular 3D Object Detection: An
Extrinsic Parameter Free Approach. Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern
Recognition (CVPR) 2021.\\
M3D-RPN & & 0.65 \% & 0.94 \% & 0.47 \% & 0.16 s / GPU & G. Brazil and X. Liu: M3D-RPN: Monocular 3D Region Proposal
Network for Object Detection. ICCV 2019.\\
MonoRUn & & 0.61 \% & 1.01 \% & 0.48 \% & 0.07 s / GPU & H. Chen, Y. Huang, W. Tian, Z. Gao and L. Xiong: MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2021.\\
Shift R-CNN (mono) & & 0.29 \% & 0.48 \% & 0.31 \% & 0.25 s / GPU & A. Naiden, V. Paunescu, G. Kim, B. Jeon and M. Leordeanu: Shift R-CNN: Deep Monocular 3D
Object Detection With Closed-form Geometric
Constraints. ICIP 2019.\\
mBoW & la & 0.00 \% & 0.00 \% & 0.00 \% & 10 s / 1 core & J. Behley, V. Steinhage and A. Cremers: Laser-based Segment Classification Using
a Mixture of Bag-of-Words. Proc. of the IEEE/RSJ International
Conference on Intelligent Robots and Systems
(IROS) 2013.
\end{tabular}