Bird's Eye View Evaluation 2017


The bird's eye view benchmark consists of 7481 training images and 7518 test images as well as the corresponding point clouds, comprising a total of 80,256 labeled objects. For evaluation, we compute precision-recall curves. To rank the methods we compute average precision. We require that all methods use the same parameter set for all test pairs. Our development kit provides details about the data format as well as MATLAB / C++ utility functions for reading and writing the label files.

We evaluate bird's eye view detection performance using the PASCAL criteria also used for 2D object detection. Far objects are thus filtered based on their bounding box height in the image plane. As only objects also appearing on the image plane are labeled, objects in don't care areas do not count as false positives. We note that the evaluation does not ignore detections that are not visible on the image plane, so these detections might give rise to false positives. For cars we require a bounding box overlap of 70% in bird's eye view, while for pedestrians and cyclists we require an overlap of 50%. Difficulties are defined as follows:

  • Easy: Min. bounding box height: 40 Px, Max. occlusion level: Fully visible, Max. truncation: 15 %
  • Moderate: Min. bounding box height: 25 Px, Max. occlusion level: Partly occluded, Max. truncation: 30 %
  • Hard: Min. bounding box height: 25 Px, Max. occlusion level: Difficult to see, Max. truncation: 50 %
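
The difficulty levels above can be read as a set of thresholds on each labeled object. The following is a minimal Python sketch, not the development kit's code: the integer occlusion encoding (0 = fully visible, 1 = partly occluded, 2 = difficult to see), the truncation expressed as a fraction, and the names `Difficulty` and `strictest_difficulty` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed integer encoding of the occlusion levels named in the list above.
FULLY_VISIBLE, PARTLY_OCCLUDED, DIFFICULT_TO_SEE = 0, 1, 2


@dataclass(frozen=True)
class Difficulty:
    name: str
    min_height_px: float   # minimum bounding box height in the image plane
    max_occlusion: int     # maximum allowed occlusion level
    max_truncation: float  # maximum allowed truncation, as a fraction in [0, 1]


# Thresholds copied from the list above, ordered from strictest to loosest.
DIFFICULTIES = (
    Difficulty("easy", 40.0, FULLY_VISIBLE, 0.15),
    Difficulty("moderate", 25.0, PARTLY_OCCLUDED, 0.30),
    Difficulty("hard", 25.0, DIFFICULT_TO_SEE, 0.50),
)


def strictest_difficulty(height_px: float, occlusion: int,
                         truncation: float) -> Optional[str]:
    """Return the strictest level whose thresholds the object satisfies.

    Returns None when the object fails even the "hard" thresholds.
    """
    for level in DIFFICULTIES:
        if (height_px >= level.min_height_px
                and occlusion <= level.max_occlusion
                and truncation <= level.max_truncation):
            return level.name
    return None
```

Because each level only loosens the limits of the previous one, an object that satisfies the easy thresholds also satisfies the moderate and hard ones; the sketch reports only the strictest match.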

All methods are ranked based on the moderately difficult results.
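
The class-specific overlap requirement described earlier (70% for cars, 50% for pedestrians and cyclists) can be illustrated with a short Python sketch. It assumes axis-aligned boxes given as `(x_min, z_min, x_max, z_max)` on the ground plane, which is a simplification since boxes in bird's eye view may be rotated, and it treats the thresholds as inclusive. The names `MIN_OVERLAP`, `bev_iou_axis_aligned` and `meets_overlap` are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, z_min, x_max, z_max), assumed layout

# Minimum bird's eye view overlap per class, as stated in the text.
MIN_OVERLAP = {"car": 0.70, "pedestrian": 0.50, "cyclist": 0.50}


def bev_iou_axis_aligned(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned ground-plane boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def meets_overlap(cls: str, detection: Box, ground_truth: Box) -> bool:
    """True if the detection overlaps the ground truth enough for its class.

    Assumption: the threshold is inclusive (overlap >= minimum).
    """
    return bev_iou_axis_aligned(detection, ground_truth) >= MIN_OVERLAP[cls]
```

For example, a detection covering three quarters of a ground-truth box and nothing outside it has an overlap of 0.75, which is enough for a car under these assumptions.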

Note 2: On 08.10.2019, we followed the suggestion of the Mapillary team in their paper Disentangling Monocular 3D Object Detection and now use 40 recall positions instead of the 11 recall positions proposed in the original Pascal VOC benchmark. This results in a fairer comparison of the results; please check their paper for details. The last leaderboards right before this change can be found here: Object Detection Evaluation, 3D Object Detection Evaluation, Bird's Eye View Evaluation.
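
As a rough illustration of average precision sampled at a fixed number of recall positions, the Python sketch below averages the interpolated precision over equally spaced recall values. It assumes the 40 positions are 1/40, 2/40, ..., 1 and that interpolated precision at a position is the best precision achieved at that recall or higher; the function name `interpolated_ap` and its input format are hypothetical, and the benchmark's own evaluation code may differ in detail.

```python
from typing import Sequence


def interpolated_ap(recalls: Sequence[float],
                    precisions: Sequence[float],
                    num_positions: int = 40) -> float:
    """Average interpolated precision over equally spaced recall positions.

    recalls[i] and precisions[i] describe one point of a precision-recall
    curve. Recall positions are i / num_positions for i = 1..num_positions
    (an assumption of this sketch).
    """
    if len(recalls) != len(precisions):
        raise ValueError("recalls and precisions must have the same length")
    total = 0.0
    for i in range(1, num_positions + 1):
        position = i / num_positions
        # Best precision among curve points with recall >= position;
        # 0.0 when the curve never reaches this recall.
        reachable = [p for r, p in zip(recalls, precisions) if r >= position]
        total += max(reachable, default=0.0)
    return total / num_positions
```

With a two-point curve of precision 1.0 at recall 0.5 and precision 0.5 at recall 1.0, the first 20 positions contribute 1.0 each and the remaining 20 contribute 0.5 each, giving 0.75.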
Important Policy Update: As more and more non-published work and re-implementations of existing work are submitted to KITTI, we have established a new policy: from now on, only submissions with significant novelty that are leading to a peer-reviewed paper in a conference or journal are allowed. Minor modifications of existing algorithms or student research projects are not allowed. Such work must be evaluated on a split of the training set. To ensure that our policy is adopted, new users must detail their status, describe their work and specify the targeted venue during registration. Furthermore, we will regularly delete all entries that are 6 months old but are still anonymous or do not have a paper associated with them. For conferences, 6 months are enough to determine whether a paper has been accepted and to add the bibliography information. For longer review cycles, you need to resubmit your results.
Additional information used by the methods
  • Stereo: Method uses left and right (stereo) images
  • Flow: Method uses optical flow (2 temporally adjacent images)
  • Multiview: Method uses more than 2 temporally adjacent images
  • Laser Points: Method uses point clouds from Velodyne laser scanner
  • Additional training data: Use of additional data sources for training (see details)

Car


Rank Method Setting Code Moderate Easy Hard Runtime Environment
1 VirConv-S 93.52 % 95.99 % 90.38 % 0.09 s 1 core @ 2.5 Ghz (C/C++)
2 VirConv-T 92.65 % 96.11 % 89.69 % 0.09 s 1 core @ 2.5 Ghz (C/C++)
3 GraR-Po code 92.12 % 95.79 % 87.11 % 0.06 s 1 core @ 2.5 Ghz (Python + C/C++)
H. Yang, Z. Liu, X. Wu, W. Wang, W. Qian, X. He and D. Cai: Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph. ECCV 2022.
4 TED 92.05 % 95.44 % 87.30 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
H. Wu, C. Wen, W. Li, R. Yang and C. Wang: Transformation-Equivariant 3D Object Detection for Autonomous Driving. AAAI 2023.
5 LIVOX_Det [Laser Points] 92.05 % 95.60 % 89.22 % n/a s 1 core @ 2.5 Ghz (Python + C/C++)
6 VirConv-L 91.95 % 95.53 % 87.07 % 0.05 s 1 core @ 2.5 Ghz (C/C++)
7 VPFNet code 91.86 % 93.02 % 86.94 % 0.06 s 2 cores @ 2.5 Ghz (Python)
H. Zhu, J. Deng, Y. Zhang, J. Ji, Q. Mao, H. Li and Y. Zhang: VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion. IEEE Transactions on Multimedia 2022.
8 SFD code 91.85 % 95.64 % 86.83 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
X. Wu, L. Peng, H. Yang, L. Xie, C. Huang, C. Deng, H. Liu and D. Cai: Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion. CVPR 2022.
9 SE-SSD [Laser Points] code 91.84 % 95.68 % 86.72 % 0.03 s 1 core @ 2.5 Ghz (Python + C/C++)
W. Zheng, W. Tang, L. Jiang and C. Fu: SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud. CVPR 2021.
10 GraR-Vo code 91.72 % 95.27 % 86.51 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
H. Yang, Z. Liu, X. Wu, W. Wang, W. Qian, X. He and D. Cai: Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph. ECCV 2022.
11 PVT-SSD 91.63 % 95.23 % 86.43 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
12 CityBrainLab 91.62 % 94.78 % 86.68 % 0.04 s 1 core @ 2.5 Ghz (Python + C/C++)
13 SPANet 91.59 % 95.59 % 86.53 % 0.06 s 1 core @ 2.5 Ghz (C/C++)
Y. Ye: SPANet: Spatial and Part-Aware Aggregation Network for 3D Object Detection. Pacific Rim International Conference on Artificial Intelligence 2021.
14 CasA code 91.54 % 95.19 % 86.82 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
H. Wu, J. Deng, C. Wen, X. Li and C. Wang: CasA: A Cascade Attention Network for 3D Object Detection from LiDAR point clouds. IEEE Transactions on Geoscience and Remote Sensing 2022.
15 LoGoNet 91.52 % 95.48 % 87.09 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
16 GraR-Pi code 91.52 % 95.06 % 86.42 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
H. Yang, Z. Liu, X. Wu, W. Wang, W. Qian, X. He and D. Cai: Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph. ECCV 2022.
17 HRNet 91.42 % 95.18 % 86.73 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
18 BiProDet 91.36 % 92.96 % 86.80 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
19 NSAW code 91.35 % 94.51 % 86.42 % 0.1 s 1 core @ 2.5 Ghz (Python)
20 BADet code 91.32 % 95.23 % 86.48 % 0.14 s 1 core @ 2.5 Ghz (C/C++)
R. Qian, X. Lai and X. Li: BADet: Boundary-Aware 3D Object Detection from Point Clouds. Pattern Recognition 2022.
21 VoCo 91.32 % 95.42 % 88.38 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
22 GT3D 91.31 % 95.05 % 86.67 % 0.1 s 1 core @ 2.5 Ghz (Python)
23 CasA++ code 91.22 % 94.57 % 88.43 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
H. Wu, J. Deng, C. Wen, X. Li and C. Wang: CasA: A Cascade Attention Network for 3D Object Detection from LiDAR point clouds. IEEE Transactions on Geoscience and Remote Sensing 2022.
24 Anonymous 91.14 % 94.04 % 86.33 % n/a s 1 core @ 2.5 Ghz (C/C++)
25 SGFusion 91.11 % 94.76 % 86.27 % 0.06 s 1 core @ 2.5 Ghz (C/C++)
26 SA-SSD code 91.03 % 95.03 % 85.96 % 0.04 s 1 core @ 2.5 Ghz (Python)
C. He, H. Zeng, J. Huang, X. Hua and L. Zhang: Structure Aware Single-stage 3D Object Detection from Point Cloud. CVPR 2020.
27 VGT-RCNN 90.89 % 94.59 % 86.36 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
28 3D Dual-Fusion 90.86 % 93.08 % 86.44 % 0.1 s 1 core @ 2.5 Ghz (Python)
29 Anonymous [Laser Points] 90.82 % 94.89 % 86.39 % 0.05 s GPU @ 3.0 Ghz (Python + C/C++)
30 MMLab PV-RCNN [Laser Points] code 90.65 % 94.98 % 86.14 % 0.08 s 1 core @ 2.5 Ghz (Python + C/C++)
S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang and H. Li: PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection. CVPR 2020.
31 VPFNet code 90.52 % 93.94 % 86.25 % 0.2 s 1 core @ 2.5 Ghz (C/C++)
C. Wang, H. Chen and L. Fu: VPFNet: Voxel-Pixel Fusion Network for Multi-class 3D Object Detection. 2021.
32 PDV code 90.48 % 94.56 % 86.23 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
J. Hu, T. Kuai and S. Waslander: Point Density-Aware Voxels for LiDAR 3D Object Detection. CVPR 2022.
33 M3DeTR code 90.37 % 94.41 % 85.98 % n/a s GPU @ 1.0 Ghz (Python)
T. Guan, J. Wang, S. Lan, R. Chandra, Z. Wu, L. Davis and D. Manocha: M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers. 2021.
34 SGDA3D 90.36 % 92.53 % 86.09 % 0.07 s 1 core @ 2.5 Ghz (Python)
35 VoTr-TSD code 90.34 % 94.03 % 86.14 % 0.07 s 1 core @ 2.5 Ghz (C/C++)
J. Mao, Y. Xue, M. Niu, H. Bai, J. Feng, X. Liang, H. Xu and C. Xu: Voxel Transformer for 3D Object Detection. ICCV 2021.
36 Under Blind Review#2 90.27 % 92.51 % 86.01 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
37 IKT3D [Laser Points] 90.23 % 94.22 % 85.94 % 0.05 s 1 core @ 2.5 Ghz (Python)
38 DSA-PV-RCNN [Laser Points] code 90.13 % 92.42 % 85.93 % 0.08 s 1 core @ 2.5 Ghz (Python + C/C++)
P. Bhattacharyya, C. Huang and K. Czarnecki: SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection. 2021.
39 XView 90.12 % 92.27 % 85.94 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
L. Xie, G. Xu, D. Cai and X. He: X-view: Non-egocentric Multi-View 3D Object Detector. 2021.
40 GraR-VoI code 90.10 % 95.69 % 86.85 % 0.07 s 1 core @ 2.5 Ghz (Python + C/C++)
H. Yang, Z. Liu, X. Wu, W. Wang, W. Qian, X. He and D. Cai: Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph. ECCV 2022.
41 CAT-Det 90.07 % 92.59 % 85.82 % 0.3 s GPU @ 2.5 Ghz (Python + C/C++)
Y. Zhang, J. Chen and D. Huang: CAT-Det: Contrastively Augmented Transformer for Multi-modal 3D Object Detection. CVPR 2022.
42 NIV-SSD 89.92 % 95.59 % 84.58 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
43 SVGA-Net 89.88 % 92.07 % 85.59 % 0.03s 1 core @ 2.5 Ghz (Python + C/C++)
Q. He, Z. Wang, H. Zeng, Y. Zeng and Y. Liu: SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds. AAAI 2022.
44 EBM3DOD code 89.86 % 95.64 % 84.56 % 0.12 s 1 core @ 2.5 Ghz (Python)
F. Gustafsson, M. Danelljan and T. Schön: Accurate 3D Object Detection using Energy-Based Models. arXiv preprint arXiv:2012.04634 2020.
45 CIA-SSD [Laser Points] code 89.84 % 93.74 % 82.39 % 0.03 s 1 core @ 2.5 Ghz (Python + C/C++)
W. Zheng, W. Tang, S. Chen, L. Jiang and C. Fu: CIA-SSD: Confident IoU-Aware Single-Stage Object Detector From Point Cloud. AAAI 2021.
46 CLOCs_PVCas code 89.80 % 93.05 % 86.57 % 0.1 s 1 core @ 2.5 Ghz (Python)
S. Pang, D. Morris and H. Radha: CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020.
47 GLENet-VR 89.76 % 93.48 % 84.89 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
Y. Zhang, Q. Zhang, Z. Zhu, J. Hou and Y. Yuan: GLENet: Boosting 3D Object Detectors with Generative Label Uncertainty Estimation. arXiv preprint arXiv:2207.02466 2022.
48 RDIoU code 89.75 % 94.90 % 84.67 % 0.03 s 1 core @ 2.5 Ghz (Python + C/C++)
H. Sheng, S. Cai, N. Zhao, B. Deng, J. Huang, X. Hua, M. Zhao and G. Lee: Rethinking IoU-based Optimization for Single-stage 3D Object Detection. ECCV 2022.
49 HRNet++ 89.69 % 95.38 % 84.75 % 0.07 s 1 core @ 2.5 Ghz (C/C++)
50 EBM3DOD baseline code 89.63 % 95.44 % 84.34 % 0.05 s 1 core @ 2.5 Ghz (Python)
F. Gustafsson, M. Danelljan and T. Schön: Accurate 3D Object Detection using Energy-Based Models. arXiv preprint arXiv:2012.04634 2020.
51 HCPVF 89.62 % 93.20 % 86.72 % 0.07 s 1 core @ 2.5 Ghz (Python + C/C++)
52 LightCPC code 89.62 % 92.99 % 86.51 % 0.02 s 1 core @ 2.5 Ghz (Python + C/C++)
53 3SNet 89.58 % 93.26 % 84.80 % 0.07 s GPU @ 2.5 Ghz (Python)
54 CAD 89.57 % 93.03 % 84.71 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
55 3D-CVF at SPA [Laser Points] 89.56 % 93.52 % 82.45 % 0.06 s 1 core @ 2.5 Ghz (C/C++)
J. Yoo, Y. Kim, J. Kim and J. Choi: 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection. ECCV 2020.
56 OcTr 89.56 % 93.08 % 86.74 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
57 ImpDet 89.55 % 92.74 % 84.41 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
58 Struc info fusion II 89.54 % 95.26 % 82.31 % 0.05 s GPU @ 2.5 Ghz (Python)
P. An, J. Liang, J. Ma, K. Yu and B. Fang: Struc info fusion. Submitted to CVIU 2021.
59 SASA [Laser Points] code 89.51 % 92.87 % 86.35 % 0.04 s 1 core @ 2.5 Ghz (Python + C/C++)
C. Chen, Z. Chen, J. Zhang and D. Tao: SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection. arXiv preprint arXiv:2201.01976 2022.
60 PA-RCNN code 89.51 % 92.95 % 82.42 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
61 Fast-CLOCs 89.49 % 93.03 % 86.40 % 0.1 s GPU @ 2.5 Ghz (Python)
S. Pang, D. Morris and H. Radha: Fast-CLOCs: Fast Camera-LiDAR Object Candidates Fusion for 3D Object Detection. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022.
62 IA-SSD (single) code 89.48 % 93.14 % 84.42 % 0.013 s 1 core @ 2.5 Ghz (C/C++)
Y. Zhang, Q. Hu, G. Xu, Y. Ma, J. Wan and Y. Guo: Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds. CVPR 2022.
63 VoxelGraphRCNN 89.48 % 93.35 % 86.68 % 0.07 s 1 core @ 2.5 Ghz (C/C++)
64 CLOCs code 89.48 % 92.91 % 86.42 % 0.1 s 1 core @ 2.5 Ghz (Python)
S. Pang, D. Morris and H. Radha: CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020.
65 PA3DNet 89.46 % 93.11 % 84.60 % 0.05 s GPU @ 2.5 Ghz (Python)
66 DVF-V 89.42 % 93.12 % 86.50 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
A. Mahmoud, J. Hu and S. Waslander: Dense Voxel Fusion for 3D Object Detection. WACV 2023.
67 GS-FPS 89.39 % 92.69 % 86.16 % TBD s 1 core @ 2.5 Ghz (C/C++)
68 Struc info fusion I 89.38 % 94.91 % 84.29 % 0.05 s 1 core @ 2.5 Ghz (Python)
P. An, J. Liang, J. Ma, K. Yu and B. Fang: Struc info fusion. Submitted to CVIU 2021.
69 SWA code 89.36 % 92.82 % 86.21 % 0.18 s 1 core @ 2.5 Ghz (C/C++)
70 IPS 89.36 % 92.78 % 86.08 % TBD s 1 core @ 2.5 Ghz (C/C++)
71 DCGNN 89.36 % 94.57 % 84.13 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
72 ATT_SSD 89.34 % 92.58 % 86.08 % 0.01 s 1 core @ 2.5 Ghz (Python)
73 BtcDet [Laser Points] code 89.34 % 92.81 % 84.55 % 0.09 s GPU @ 2.5 Ghz (Python + C/C++)
Q. Xu, Y. Zhong and U. Neumann: Behind the Curtain: Learning Occluded Shapes for 3D Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence 2022.
74 IA-SSD (multi) code 89.33 % 92.79 % 84.35 % 0.014 s 1 core @ 2.5 Ghz (C/C++)
Y. Zhang, Q. Hu, G. Xu, Y. Ma, J. Wan and Y. Guo: Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds. CVPR 2022.
75 Anomynous 89.29 % 92.99 % 86.49 % 0.09 s 1 core @ 2.5 Ghz (C/C++)
76 Anonymous 89.27 % 92.79 % 86.53 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
77 GEO_LOC 89.25 % 92.80 % 84.23 % TBD s 1 core @ 2.5 Ghz (C/C++)
78 TBD 89.24 % 92.59 % 85.99 % 0.1 s 1 core @ 2.5 Ghz (Python)
79 KPSCC code 89.21 % 92.88 % 85.87 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
80 ACDet code 89.21 % 92.87 % 85.80 % 0.05 s 1 core @ 2.5 Ghz (C/C++)
J. Xu, G. Wang, X. Zhang and G. Wan: ACDet: Attentive Cross-view Fusion for LiDAR-based 3D Object Detection. 3DV 2022.
81 DVF-PV 89.20 % 93.08 % 86.28 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
A. Mahmoud, J. Hu and S. Waslander: Dense Voxel Fusion for 3D Object Detection. WACV 2023.
82 TTT_SSD 89.20 % 92.55 % 86.07 % TBD s 1 core @ 2.5 Ghz (C/C++)
83 STD code 89.19 % 94.74 % 86.42 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Yang, Y. Sun, S. Liu, X. Shen and J. Jia: STD: Sparse-to-Dense 3D Object Detector for Point Cloud. ICCV 2019.
84 GS-FPS-LT 89.18 % 92.74 % 84.17 % TBD s 1 core @ 2.5 Ghz (C/C++)
85 Point-GNN [Laser Points] code 89.17 % 93.11 % 83.90 % 0.6 s GPU @ 2.5 Ghz (Python)
W. Shi and R. Rajkumar: Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud. CVPR 2020.
86 HMFI code 89.17 % 93.04 % 86.37 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
X. Li, B. Shi, Y. Hou, X. Wu, T. Ma, Y. Li and L. He: Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection. ECCV 2022.
87 SSL-PointGNN code 89.16 % 92.92 % 83.99 % 0.56 s GPU @ 1.5 Ghz (Python)
E. Erçelik, E. Yurtsever, M. Liu, Z. Yang, H. Zhang, P. Topçam, M. Listl, Y. Çaylı and A. Knoll: 3D Object Detection with a Self-supervised Lidar Scene Flow Backbone. arXiv preprint arXiv:2205.00705 2022.
88 USVLab BSAODet 89.13 % 92.92 % 86.41 % 0.04 s GPU @ 2.5 Ghz (Python + C/C++)
89 SPG_mini [Laser Points] code 89.12 % 92.80 % 86.27 % 0.09 s GPU @ 2.5 Ghz (Python)
Q. Xu, Y. Zhou, W. Wang, C. Qi and D. Anguelov: SPG: Unsupervised Domain Adaptation for 3D Object Detection via Semantic Point Generation. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.
90 HPV-RCNN 89.12 % 92.49 % 83.98 % 0.15 s 1 core @ 2.5 Ghz (Python)
91 ITCA-SSD code 89.12 % 93.19 % 83.99 % 0.05 s 1 core @ 2.5 Ghz (Python)
92 PV-DT3D 89.10 % 92.65 % 86.43 % 1.4 s 1 core @ 2.5 Ghz (C/C++)
93 EQ-PVRCNN code 89.09 % 94.55 % 86.42 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Yang, L. Jiang, Y. Sun, B. Schiele and J. Jia: A Unified Query-based Paradigm for Point Cloud Understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2022.
94 SPT 89.09 % 94.87 % 84.38 % 0.1 s GPU @ 2.5 Ghz (Python)
95 TBD code 89.09 % 92.61 % 83.85 % 0.1 s GPU @ 2.5 Ghz (Python)
96 VoxSeT code 89.07 % 92.70 % 86.29 % 33 ms 1 core @ 2.5 Ghz (C/C++)
C. He, R. Li, S. Li and L. Zhang: Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds. CVPR 2022.
97 ChTR3D 89.04 % 92.72 % 86.29 % 0.06 s 1 core @ 2.5 Ghz (Python + C/C++)
98 3DSSD code 89.02 % 92.66 % 85.86 % 0.04 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Yang, Y. Sun, S. Liu and J. Jia: 3DSSD: Point-based 3D Single Stage Object Detector. CVPR 2020.
99 EPNet++ 89.00 % 95.41 % 85.73 % 0.1 s GPU @ 2.5 Ghz (Python)
Z. Liu, H. tengteng, B. Li, X. Chen, X. Wang and X. Bai: EPNet++: Cascade Bi-directional Fusion for Multi-Modal 3D Object Detection. arXiv preprint arXiv:2112.11088 2021.
100 Focals Conv code 89.00 % 92.67 % 86.33 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
Y. Chen, Y. Li, X. Zhang, J. Sun and J. Jia: Focal Sparse Convolutional Networks for 3D Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2022.
101 LGNet 88.98 % 92.83 % 86.26 % 0.07 s 1 core @ 2.5 Ghz (C/C++)
102 ChTR3D 88.98 % 92.35 % 86.17 % 0.06 s 1 core @ 2.5 Ghz (Python + C/C++)
103 VGRCNN++ 88.96 % 92.96 % 86.25 % 0.07 s 1 core @ 2.5 Ghz (C/C++)
104 PTA-RCNN 88.94 % 92.32 % 85.63 % 0.08 s 1 core @ 2.5 Ghz (Python)
105 GV-RCNN code 88.94 % 94.52 % 86.24 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
106 SPNet code 88.92 % 92.29 % 86.16 % 0.08 s 1 core @ 2.5 Ghz (C/C++)
107 AGS-SSD[la] 88.90 % 92.51 % 85.96 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
108 BSConv 88.88 % 92.49 % 85.65 % 0.1 s 1 core @ 2.5 Ghz (Java)
109 CZY_PPF_Net2 88.88 % 94.68 % 86.15 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
110 H^23D R-CNN code 88.87 % 92.85 % 86.07 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
J. Deng, W. Zhou, Y. Zhang and H. Li: From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection. IEEE Transactions on Circuits and Systems for Video Technology 2021.
111 ChTR3D 88.85 % 92.58 % 85.98 % 0.06 s 1 core @ 2.5 Ghz (Python + C/C++)
112 Pyramid R-CNN 88.84 % 92.19 % 86.21 % 0.07 s 1 core @ 2.5 Ghz (C/C++)
J. Mao, M. Niu, H. Bai, X. Liang, H. Xu and C. Xu: Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection. ICCV 2021.
113 CityBrainLab-CT3D code 88.83 % 92.36 % 84.07 % 0.07 s 1 core @ 2.5 Ghz (Python + C/C++)
H. Sheng, S. Cai, Y. Liu, B. Deng, J. Huang, X. Hua and M. Zhao: Improving 3D Object Detection with Channel-wise Transformer. ICCV 2021.
114 Voxel R-CNN code 88.83 % 94.85 % 86.13 % 0.04 s GPU @ 3.0 Ghz (C/C++)
J. Deng, S. Shi, P. Li, W. Zhou, Y. Zhang and H. Li: Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection. AAAI 2021.
115 HVNet 88.82 % 92.83 % 83.38 % 0.03 s GPU @ 2.0 Ghz (Python)
M. Ye, S. Xu and T. Cao: HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection. CVPR 2020.
116 VG-RCNN 88.81 % 92.75 % 86.12 % 0.07 s 1 core @ 2.5 Ghz (C/C++)
117 GLENet 88.81 % 92.22 % 84.13 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
118 FV2P v2 88.80 % 92.22 % 84.24 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
119 mbdf-netv1 code 88.77 % 94.45 % 83.90 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
120 BASA 88.76 % 92.72 % 83.71 % 1s 1 core @ 2.5 Ghz (python)
121 CZY_3917 88.71 % 94.23 % 86.01 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
122 SPG [Laser Points] code 88.70 % 94.33 % 85.98 % 0.09 s 1 core @ 2.5 Ghz (Python + C/C++)
Q. Xu, Y. Zhou, W. Wang, C. Qi and D. Anguelov: SPG: Unsupervised Domain Adaptation for 3D Object Detection via Semantic Point Generation. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.
123 MVMM code 88.70 % 92.17 % 85.47 % 0.04 s GPU @ 2.5 Ghz (Python + C/C++)
124 VGRCNN 88.69 % 92.58 % 86.02 % 0.07 s 1 core @ 2.5 Ghz (C/C++)
125 DTE3D 88.69 % 92.61 % 85.77 % 0.19 s 1 core @ 2.5 Ghz (C/C++)
126 DCAN-Second code 88.68 % 92.76 % 85.32 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
127 PSA-SSD 88.65 % 92.21 % 83.75 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
128 SIENet code 88.65 % 92.38 % 86.03 % 0.08 s 1 core @ 2.5 Ghz (Python)
Z. Li, Y. Yao, Z. Quan, W. Yang and J. Xie: SIENet: Spatial Information Enhancement Network for 3D Object Detection from Point Cloud. 2021.
129 CZY_PPF_Net 88.65 % 92.78 % 85.83 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
130 P2V-RCNN 88.63 % 92.72 % 86.14 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
J. Li, S. Luo, Z. Zhu, H. Dai, A. Krylov, Y. Ding and L. Shao: P2V-RCNN: Point to Voxel Feature Learning for 3D Object Detection from Point Clouds. IEEE Access 2021.
131 FromVoxelToPoint code 88.61 % 92.23 % 86.11 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
J. Li, H. Dai, L. Shao and Y. Ding: From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder. MM '21: The 29th ACM International Conference on Multimedia (ACM MM) 2021.
132 RangeIoUDet [Laser Points] 88.59 % 92.28 % 85.83 % 0.02 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Liang, Z. Zhang, M. Zhang, X. Zhao and S. Pu: RangeIoUDet: Range Image Based Real-Time 3D Object Detector Optimized by Intersection Over Union. CVPR 2021.
133 WGVRF 88.56 % 92.45 % 85.69 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
134 DCCA 88.55 % 92.29 % 85.85 % 0.05 s 1 core @ 2.5 Ghz (C/C++)
135 GVNet-V2 88.54 % 92.26 % 85.71 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
136 VGA-RCNN 88.53 % 92.37 % 85.77 % 0.07 s 1 core @ 2.5 Ghz (Python)
137 TVTr 88.51 % 94.30 % 85.80 % 0.08 s 1 core @ 2.5 Ghz (Python)
138 Anonymous 88.49 % 92.40 % 85.77 % 0.03s
139 EPNet code 88.47 % 94.22 % 83.69 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
T. Huang, Z. Liu, X. Chen and X. Bai: EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection. ECCV 2020.
140 CenterNet3D 88.46 % 91.80 % 83.62 % 0.04 s GPU @ 1.5 Ghz (Python)
G. Wang, B. Tian, Y. Ai, T. Xu, L. Chen and D. Cao: CenterNet3D: An Anchor free Object Detector for Autonomous Driving. 2020.
141 FARP-Net code 88.45 % 91.20 % 86.01 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
142 GVNet code 88.43 % 92.19 % 85.63 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
143 USVLab BSAODet (S) 88.42 % 92.19 % 85.55 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
144 DGT-Det3D code 88.41 % 92.57 % 85.50 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
145 Semantical PVRCNN 88.41 % 92.71 % 85.86 % 0.07 s 1 core @ 2.5 Ghz (C/C++)
146 PVE 88.40 % 92.49 % 85.79 % 0.3 s 1 core @ 2.5 Ghz (C/C++)
147 RangeRCNN [Laser Points] 88.40 % 92.15 % 85.74 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Liang, M. Zhang, Z. Zhang, X. Zhao and S. Pu: RangeRCNN: Towards Fast and Accurate 3D Object Detection with Range Image Representation. arXiv preprint arXiv:2009.00206 2020.
148 Patches [Laser Points] 88.39 % 92.72 % 83.19 % 0.15 s GPU @ 2.0 Ghz
J. Lehner, A. Mitterecker, T. Adler, M. Hofmarcher, B. Nessler and S. Hochreiter: Patch Refinement: Localized 3D Object Detection. arXiv preprint arXiv:1910.04093 2019.
149 3D IoU-Net 88.38 % 94.76 % 81.93 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
J. Li, S. Luo, Z. Zhu, H. Dai, S. Krylov, Y. Ding and L. Shao: 3D IoU-Net: IoU Guided 3D Object Detector for Point Clouds. arXiv preprint arXiv:2004.04962 2020.
150 StructuralIF 88.38 % 91.78 % 85.67 % 0.02 s 8 cores @ 2.5 Ghz (Python)
J. Pei An: Deep structural information fusion for 3D object detection on LiDAR-camera system. Accepted in CVIU 2021.
151 CSVoxel-RCNN 88.37 % 92.07 % 85.51 % 0.03 s GPU @ 1.0 Ghz (Python)
152 VPNet 88.37 % 92.11 % 85.63 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
153 NV-RCNN 88.36 % 91.41 % 85.72 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
154 FSFNet 88.35 % 94.88 % 83.58 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
155 DKDet 88.32 % 92.21 % 85.46 % 0.03 s GPU @ 2.5 Ghz (Python + C/C++)
156 CenterFuse 88.31 % 91.54 % 83.39 % 0.059 sec/frame 2 x V100
157 TBD 88.26 % 91.44 % 85.44 % 0.06 s GPU @ 2.5 Ghz (Python)
158 KPP3D code 88.25 % 93.93 % 83.26 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
159 CLOCs_SecCas 88.23 % 91.16 % 82.63 % 0.1 s 1 core @ 2.5 Ghz (Python)
S. Pang, D. Morris and H. Radha: CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020.
160 SPVB-SSD 88.23 % 91.82 % 85.46 % 0.04 s GPU @ 2.5 Ghz (Python + C/C++)
161 U_SECOND_V4 88.22 % 91.95 % 85.03 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
162 U_RVRCNN_V2_1 88.21 % 92.05 % 85.39 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
163 UberATG-MMF [Laser Points] 88.21 % 93.67 % 81.99 % 0.08 s GPU @ 2.5 Ghz (Python)
M. Liang*, B. Yang*, Y. Chen, R. Hu and R. Urtasun: Multi-Task Multi-Sensor Fusion for 3D Object Detection. CVPR 2019.
164 Patches - EMP [Laser Points] 88.17 % 94.49 % 84.75 % 0.5 s GPU @ 2.5 Ghz (Python)
J. Lehner, A. Mitterecker, T. Adler, M. Hofmarcher, B. Nessler and S. Hochreiter: Patch Refinement: Localized 3D Object Detection. arXiv preprint arXiv:1910.04093 2019.
165 SRDL 88.17 % 92.01 % 85.43 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
166 CF-cd-io-tv 88.16 % 91.32 % 83.26 % 1 s 1 core @ 2.5 Ghz (C/C++)
167 PSA-Det3D 88.13 % 92.08 % 85.35 % 0.1 s GPU @ 2.5 Ghz (Python)
168 PVRCNN_8369 88.13 % 91.91 % 85.40 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
169 PointPainting [Laser Points] 88.11 % 92.45 % 83.36 % 0.4 s GPU @ 2.5 Ghz (Python + C/C++)
S. Vora, A. Lang, B. Helou and O. Beijbom: PointPainting: Sequential Fusion for 3D Object Detection. CVPR 2020.
170 SERCNN [Laser Points] 88.10 % 94.11 % 83.43 % 0.1 s 1 core @ 2.5 Ghz (Python)
D. Zhou, J. Fang, X. Song, L. Liu, J. Yin, Y. Dai, H. Li and R. Yang: Joint 3D Instance Segmentation and Object Detection for Autonomous Driving. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2020.
171 Associate-3Ddet code 88.09 % 91.40 % 82.96 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
L. Du, X. Ye, X. Tan, J. Feng, Z. Xu, E. Ding and S. Wen: Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection. The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020.
172 HotSpotNet 88.09 % 94.06 % 83.24 % 0.04 s 1 core @ 2.5 Ghz (Python + C/C++)
Q. Chen, L. Sun, Z. Wang, K. Jia and A. Yuille: object as hotspots. Proceedings of the European Conference on Computer Vision (ECCV) 2020.
173 Faraway-Frustum [Laser Points] code 88.08 % 91.90 % 85.35 % 0.1 s GPU @ 2.5 Ghz (Python)
H. Zhang, D. Yang, E. Yurtsever, K. Redmill and U. Ozguner: Faraway-frustum: Dealing with lidar sparsity for 3D object detection using fusion. 2021 IEEE International Intelligent Transportation Systems Conference (ITSC) 2021.
174 TBD 88.04 % 91.31 % 84.79 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
175 SC-Voxel-RCNN 88.02 % 91.45 % 85.22 % 0.12 s GPU @ 1.0 Ghz (Python)
176 CZY 88.00 % 91.85 % 85.22 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
177 UberATG-HDNET [Laser Points] 87.98 % 93.13 % 81.23 % 0.05 s GPU @ 2.5 Ghz (Python)
B. Yang, M. Liang and R. Urtasun: HDNET: Exploiting HD Maps for 3D Object Detection. 2nd Conference on Robot Learning (CoRL) 2018.
178 Anonymous 87.96 % 91.52 % 82.99 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
179 TCDVF 87.94 % 91.21 % 84.66 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
180 DGT-Det3D 87.88 % 91.70 % 85.14 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
181 Fast Point R-CNN [Laser Points] 87.84 % 90.87 % 80.52 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
Y. Chen, S. Liu, X. Shen and J. Jia: Fast Point R-CNN. Proceedings of the IEEE international conference on computer vision (ICCV) 2019.
182 CSNet 87.84 % 92.23 % 82.93 % 0.1 s 1 core @ 2.5 Ghz (Python)
183 CF-ctdep-tv-ta 87.81 % 90.73 % 84.97 % 1 s 1 core @ 2.5 Ghz (C/C++)
184 Anonymous 87.80 % 91.58 % 82.86 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
185 MMLab-PartA^2 [Laser Points] code 87.79 % 91.70 % 84.61 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
S. Shi, Z. Wang, J. Shi, X. Wang and H. Li: From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020.
186 cp-tv-kp-io-sc 87.78 % 90.98 % 84.04 % 1 s 1 core @ 2.5 Ghz (C/C++)
187 SIF 87.76 % 91.44 % 85.15 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
P. An: SIF. Submitted to CVIU 2021.
188 U_PVRCNN_V2 87.74 % 91.62 % 85.03 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
189 MVAF-Net code 87.73 % 91.95 % 85.00 % 0.06 s 1 core @ 2.5 Ghz (Python + C/C++)
G. Wang, B. Tian, Y. Zhang, L. Chen, D. Cao and J. Wu: Multi-View Adaptive Fusion Network for 3D Object Detection. arXiv preprint arXiv:2011.00652 2020.
190 Reprod-Two-Branch 87.69 % 90.69 % 84.72 % 0.05 s 1 core @ 2.5 Ghz (C/C++)
191 DKAnet 87.68 % 91.07 % 84.03 % 0.05 s 1 core @ 2.0 Ghz (Python)
192 DVFENet 87.68 % 90.93 % 84.60 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
Y. He, G. Xia, Y. Luo, L. Su, Z. Zhang, W. Li and P. Wang: DVFENet: Dual-branch Voxel Feature Extraction Network for 3D Object Detection. Neurocomputing 2021.
193 S-AT GCN 87.68 % 90.85 % 84.20 % 0.02 s GPU @ 2.0 Ghz (Python)
L. Wang, C. Wang, X. Zhang, T. Lan and J. Li: S-AT GCN: Spatial-Attention Graph Convolution Network based Feature Enhancement for 3D Object Detection. CoRR 2021.
194 TBD 87.67 % 91.02 % 82.42 % 0.1 s 1 core @ 2.5 Ghz (Python)
195 CFF-tv-v2 87.67 % 90.70 % 84.58 % 1 s 1 core @ 2.5 Ghz (C/C++)
196 RangeDet (Official) code 87.67 % 90.93 % 82.92 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
L. Fan, X. Xiong, F. Wang, N. Wang and Z. Zhang: RangeDet: In Defense of Range View for LiDAR-Based 3D Object Detection. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.
197 CFF-ep25 87.66 % 90.60 % 84.71 % 1 s 1 core @ 2.5 Ghz (C/C++)
198 Anonymous 87.64 % 91.40 % 82.97 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
199 TBD 87.62 % 90.86 % 82.29 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
200 CF-base-tv 87.60 % 90.28 % 84.52 % 1 s 1 core @ 2.5 Ghz (C/C++)
201 KeyFuse2B 87.59 % 90.70 % 84.58 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
202 MODet [Laser Points] 87.56 % 90.80 % 82.69 % 0.05 s GTX1080Ti
Y. Zhang, Z. Xiang, C. Qiao and S. Chen: Accurate and Real-Time Object Detection Based on Bird's Eye View on 3D Point Clouds. 2019 International Conference on 3D Vision (3DV) 2019.
203 CFF-tv 87.55 % 90.56 % 84.59 % 1 s 1 core @ 2.5 Ghz (C/C++)
204 cff-tv-v2-ep25 87.55 % 90.26 % 84.53 % 1 s 1 core @ 2.5 Ghz (C/C++)
205 AB3DMOT [Laser Points; online method, no batch processing] code 87.53 % 91.99 % 81.03 % 0.0047s 1 core @ 2.5 Ghz (Python)
X. Weng and K. Kitani: A Baseline for 3D Multi-Object Tracking. arXiv:1907.03961 2019.
206 TBD 87.51 % 90.76 % 80.15 % 0.1 s 1 core @ 2.5 Ghz (Python)
207 DTFI 87.51 % 91.01 % 84.25 % 0.03 s 1 core @ 2.5 Ghz (Python)
208 CF-ctdep-tv 87.50 % 90.56 % 84.65 % 1 s 1 core @ 2.5 Ghz (C/C++)
209 PointRGCN 87.49 % 91.63 % 80.73 % 0.26 s GPU @ V100 (Python)
J. Zarzar, S. Giancola and B. Ghanem: PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement. ArXiv 2019.
210 Anonymous 87.48 % 90.98 % 84.22 % 1 s 1 core @ 2.5 Ghz (Python)
211 SECOND_7862 87.48 % 90.98 % 84.22 % 1 s 1 core @ 2.5 Ghz (Python)
212 MGAF-3DSSD code 87.47 % 92.70 % 82.19 % 0.1 s 1 core @ 2.5 Ghz (Python)
J. Li, H. Dai, L. Shao and Y. Ding: Anchor-free 3D Single Stage Detector with Mask-Guided Attention for Point Cloud. MM '21: The 29th ACM International Conference on Multimedia (ACM MM) 2021.
213 PC-CNN-V2
This method makes use of Velodyne laser scans.
87.40 % 91.19 % 79.35 % 0.5 s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Du, M. Ang, S. Karaman and D. Rus: A General Pipeline for 3D Detection of Vehicles. 2018 IEEE International Conference on Robotics and Automation (ICRA) 2018.
214 PVTr 87.39 % 91.21 % 84.77 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
215 MMLab-PointRCNN
This method makes use of Velodyne laser scans.
code 87.39 % 92.13 % 82.72 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
S. Shi, X. Wang and H. Li: Pointrcnn: 3d object proposal generation and detection from point cloud. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019.
216 Sem-Aug 87.37 % 93.35 % 82.43 % 0.08 s GPU @ 2.5 Ghz (Python)
217 MAFF-Net(DAF-Pillar) 87.34 % 90.79 % 77.66 % 0.04 s 1 core @ 2.5 Ghz (Python + C/C++)
Z. Zhang, Z. Liang, M. Zhang, X. Zhao, Y. Ming, T. Wenming and S. Pu: MAFF-Net: Filter False Positive for 3D Vehicle Detection with Multi-modal Adaptive Feature Fusion. arXiv preprint arXiv:2009.10945 2020.
218 KeyPoint-IoUHead 87.32 % 90.36 % 83.23 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
219 Harmonic PointPillar code 87.28 % 90.89 % 82.54 % 0.01 s 1 core @ 2.5 Ghz (Python)
H. Zhang, M. Mekala, Z. Nain, J. Park and H. Jung: Harmonic 3D: Time-friendly and Task-consistent LiDAR-based 3D Object Detection on Edge. will submit to IEEE Transactions on Intelligent Transportation Systems 2022.
220 ZMMPP 87.25 % 90.47 % 82.42 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
221 HRI-VoxelFPN 87.21 % 92.75 % 79.82 % 0.02 s GPU @ 2.5 Ghz (Python + C/C++)
H. Kuang, B. Wang, J. An, M. Zhang and Z. Zhang: Voxel-FPN: multi-scale voxel feature aggregation in 3D object detection from point clouds. Sensors 2020.
222 epBRM
This method makes use of Velodyne laser scans.
code 87.13 % 90.70 % 81.92 % 0.1 s GPU @ >3.5 Ghz (Python + C/C++)
K. Shin: Improving a Quality of 3D Object Detection by Spatial Transformation Mechanism. arXiv preprint arXiv:1910.04853 2019.
223 T_PVRCNN 86.97 % 91.63 % 82.20 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
224 SARPNET 86.92 % 92.21 % 81.68 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
Y. Ye, H. Chen, C. Zhang, X. Hao and Z. Zhang: SARPNET: Shape Attention Regional Proposal Network for LiDAR-based 3D Object Detection. Neurocomputing 2019.
225 cff-tv-t 86.92 % 91.04 % 80.46 % 1 s 1 core @ 2.5 Ghz (C/C++)
226 CF-base-train 86.88 % 90.03 % 83.16 % 1 s 1 core @ 2.5 Ghz (C/C++)
227 Self-Calib Conv 86.86 % 90.00 % 83.88 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
228 T_PVRCNN_V2 86.85 % 91.54 % 81.82 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
229 ARPNET 86.81 % 90.06 % 79.41 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
Y. Ye, C. Zhang and X. Hao: ARPNET: attention region proposal network for 3D object detection. Science China Information Sciences 2019.
230 C-GCN 86.78 % 91.11 % 80.09 % 0.147 s GPU @ V100 (Python)
J. Zarzar, S. Giancola and B. Ghanem: PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement. ArXiv 2019.
231 IoU-2B 86.74 % 90.92 % 80.40 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
232 cp-tv-kp 86.58 % 89.58 % 83.64 % 1 s 1 core @ 2.5 Ghz (C/C++)
233 CAD
This method uses stereo information.
This method makes use of Velodyne laser scans.
86.56 % 90.00 % 81.62 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
234 PointPillars
This method makes use of Velodyne laser scans.
code 86.56 % 90.07 % 82.81 % 16 ms 1080ti GPU and Intel i7 CPU
A. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang and O. Beijbom: PointPillars: Fast Encoders for Object Detection from Point Clouds. CVPR 2019.
235 TANet code 86.54 % 91.58 % 81.19 % 0.035s GPU @ 2.5 Ghz (Python + C/C++)
Z. Liu, X. Zhao, T. Huang, R. Hu, Y. Zhou and X. Bai: TANet: Robust 3D Object Detection from Point Clouds with Triple Attention. AAAI 2020.
236 cp-tv 86.52 % 89.55 % 83.45 % 1 s 1 core @ 2.5 Ghz (C/C++)
237 SCNet
This method makes use of Velodyne laser scans.
86.48 % 90.07 % 81.30 % 0.04 s GPU @ 3.0 Ghz (Python)
Z. Wang, H. Fu, L. Wang, L. Xiao and B. Dai: SCNet: Subdivision Coding Network for Object Detection Based on 3D Point Cloud. IEEE Access 2019.
238 CF-ctdep-train 86.46 % 89.57 % 82.03 % 1 s 1 core @ 2.5 Ghz (C/C++)
239 CSNet8306 code 86.44 % 92.57 % 81.36 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
240 SegVoxelNet 86.37 % 91.62 % 83.04 % 0.04 s 1 core @ 2.5 Ghz (Python)
H. Yi, S. Shi, M. Ding, J. Sun, K. Xu, H. Zhou, Z. Wang, S. Li and G. Wang: SegVoxelNet: Exploring Semantic Context and Depth-aware Features for 3D Vehicle Detection from Point Cloud. ICRA 2020.
241 Dune-DCF-e09 86.36 % 89.33 % 81.77 % 1 s 1 core @ 2.5 Ghz (C/C++)
242 Dune-DCF-e11 86.32 % 89.32 % 81.78 % 1 s 1 core @ 2.5 Ghz (C/C++)
243 3D IoU Loss
This method makes use of Velodyne laser scans.
86.22 % 91.36 % 81.20 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
D. Zhou, J. Fang, X. Song, C. Guan, J. Yin, Y. Dai and R. Yang: IoU Loss for 2D/3D Object Detection. International Conference on 3D Vision (3DV) 2019.
244 Dune-DCF-e15 86.21 % 88.99 % 81.62 % 1 s 1 core @ 2.5 Ghz (C/C++)
245 CrazyTensor-CF 86.10 % 89.13 % 81.61 % 1 s 1 core @ 2.5 Ghz (C/C++)
246 City-CF-fixed 86.09 % 89.94 % 81.73 % 1 s 1 core @ 2.5 Ghz (C/C++)
247 R-GCN 86.05 % 91.91 % 81.05 % 0.16 s GPU @ 2.5 Ghz (Python)
J. Zarzar, S. Giancola and B. Ghanem: PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement. ArXiv 2019.
248 UberATG-PIXOR++
This method makes use of Velodyne laser scans.
86.01 % 93.28 % 80.11 % 0.035 s GPU @ 2.5 Ghz (Python)
B. Yang, M. Liang and R. Urtasun: HDNET: Exploiting HD Maps for 3D Object Detection. 2nd Conference on Robot Learning (CoRL) 2018.
249 CAT 85.97 % 91.48 % 80.93 % 1 s 1 core @ 2.5 Ghz (Python)
250 SSL_PP code 85.93 % 92.19 % 80.40 % 16ms GPU @ 1.5 Ghz (Python)
251 CSNet8299 code 85.91 % 91.64 % 80.95 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
252 Sem-Aug-PointRCNN++ 85.88 % 91.68 % 83.37 % 0.1 s 8 cores @ 3.0 Ghz (Python)
253 DASS 85.85 % 91.74 % 80.97 % 0.09 s 1 core @ 2.0 Ghz (Python)
O. Unal, L. Van Gool and D. Dai: Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2021.
254 F-ConvNet
This method makes use of Velodyne laser scans.
code 85.84 % 91.51 % 76.11 % 0.47 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Wang and K. Jia: Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection. IROS 2019.
255 City-CF 85.83 % 89.20 % 81.61 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
256 PI-RCNN 85.81 % 91.44 % 81.00 % 0.1 s 1 core @ 2.5 Ghz (Python)
L. Xie, C. Xiang, Z. Yu, G. Xu, Z. Yang, D. Cai and X. He: PI-RCNN: An Efficient Multi-sensor 3D Object Detector with Point-based Attentive Cont-conv Fusion Module. AAAI 2020 : The Thirty-Fourth AAAI Conference on Artificial Intelligence 2020.
257 LazyTorch-CP-Infer-O 85.74 % 89.19 % 81.35 % 1 s 1 core @ 2.5 Ghz (C/C++)
258 PointRGBNet 85.73 % 91.39 % 80.68 % 0.08 s 4 cores @ 2.5 Ghz (Python + C/C++)
P. Xie Desheng: Real-time Detection of 3D Objects Based on Multi-Sensor Information Fusion. Automotive Engineering 2022.
259 AFTD 85.63 % 90.61 % 82.28 % 1 s 1 core @ 2.5 Ghz (Python + C/C++)
260 LazyTorch-CP-Small-P 85.63 % 89.10 % 81.27 % 1 s 1 core @ 2.5 Ghz (C/C++)
261 CrazyTensor-CP 85.55 % 87.94 % 82.63 % 1 s 1 core @ 2.5 Ghz (Python)
262 variance_point 85.39 % 91.90 % 81.13 % 0.05 s 1 core @ 2.5 Ghz (Python)
263 UberATG-ContFuse
This method makes use of Velodyne laser scans.
85.35 % 94.07 % 75.88 % 0.06 s GPU @ 2.5 Ghz (Python)
M. Liang, B. Yang, S. Wang and R. Urtasun: Deep Continuous Fusion for Multi-Sensor 3D Object Detection. ECCV 2018.
264 new_stereo 85.24 % 90.74 % 82.10 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
265 PSM_stereo 85.12 % 90.26 % 80.21 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
266 PFF3D
This method makes use of Velodyne laser scans.
code 85.08 % 89.61 % 80.42 % 0.05 s GPU @ 3.0 Ghz (Python + C/C++)
L. Wen and K. Jo: Fast and Accurate 3D Object Detection for Lidar-Camera-Based Autonomous Vehicles Using One Shared Voxel-Based Backbone. IEEE Access 2021.
267 CenterPoint (pcdet) 85.05 % 88.47 % 81.19 % 0.051 sec/frame 2 x V100
268 AVOD
This method makes use of Velodyne laser scans.
code 84.95 % 89.75 % 78.32 % 0.08 s Titan X (pascal)
J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object Detection from View Aggregation. IROS 2018.
269 WS3D
This method makes use of Velodyne laser scans.
84.93 % 90.96 % 77.96 % 0.1 s GPU @ 2.5 Ghz (Python)
Q. Meng, W. Wang, T. Zhou, J. Shen, L. Van Gool and D. Dai: Weakly Supervised 3D Object Detection from Lidar Point Cloud. 2020.
270 AVOD-FPN
This method makes use of Velodyne laser scans.
code 84.82 % 90.99 % 79.62 % 0.1 s Titan X (Pascal)
J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object Detection from View Aggregation. IROS 2018.
271 MF 84.72 % 88.58 % 78.17 % 0.05 s 1 core @ 2.5 Ghz (C/C++)
272 F-PointNet
This method makes use of Velodyne laser scans.
code 84.67 % 91.17 % 74.77 % 0.17 s GPU @ 3.0 Ghz (Python)
C. Qi, W. Liu, C. Wu, H. Su and L. Guibas: Frustum PointNets for 3D Object Detection from RGB-D Data. arXiv preprint arXiv:1711.08488 2017.
273 3DBN
This method makes use of Velodyne laser scans.
83.94 % 89.66 % 76.50 % 0.13s 1080Ti (Python+C/C++)
X. Li, J. Guivant, N. Kwok and Y. Xu: 3D Backbone Network for 3D Object Detection. CoRR 2019.
274 MLOD
This method makes use of Velodyne laser scans.
code 82.68 % 90.25 % 77.97 % 0.12 s GPU @ 1.5 Ghz (Python)
J. Deng and K. Czarnecki: MLOD: A multi-view 3D object detection based on robust feature fusion method. arXiv preprint arXiv:1909.04163 2019.
275 BirdNet+
This method makes use of Velodyne laser scans.
code 81.85 % 87.43 % 75.36 % 0.11 s Titan Xp (PyTorch)
A. Barrera, J. Beltrán, C. Guindel, J. Iglesias and F. García: BirdNet+: Two-Stage 3D Object Detection in LiDAR through a Sparsity-Invariant Bird’s Eye View. IEEE Access 2021.
276 TBD 81.53 % 87.90 % 74.26 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
277 FD 81.47 % 88.34 % 75.07 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
278 CZY 81.21 % 89.10 % 76.13 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
279 DMF
This method uses stereo information.
80.29 % 84.64 % 76.05 % 0.2 s 1 core @ 2.5 Ghz (Python + C/C++)
X. J. Chen and W. Xu: Disparity-Based Multiscale Fusion Network for Transportation Detection. IEEE Transactions on Intelligent Transportation Systems 2022.
280 UberATG-PIXOR
This method makes use of Velodyne laser scans.
80.01 % 83.97 % 74.31 % 0.035 s TITAN Xp (Python)
B. Yang, W. Luo and R. Urtasun: PIXOR: Real-time 3D Object Detection from Point Clouds. CVPR 2018.
281 MV3D (LIDAR)
This method makes use of Velodyne laser scans.
78.98 % 86.49 % 72.23 % 0.24 s GPU @ 2.5 Ghz (Python + C/C++)
X. Chen, H. Ma, J. Wan, B. Li and T. Xia: Multi-View 3D Object Detection Network for Autonomous Driving. CVPR 2017.
282 DSGN++
This method uses stereo information.
code 78.94 % 88.55 % 69.74 % 0.2 s GeForce RTX 2080Ti
Y. Chen, S. Huang, S. Liu, B. Yu and J. Jia: DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors. arXiv preprint arXiv:2204.03039 2022.
283 MV3D
This method makes use of Velodyne laser scans.
78.93 % 86.62 % 69.80 % 0.36 s GPU @ 2.5 Ghz (Python + C/C++)
X. Chen, H. Ma, J. Wan, B. Li and T. Xia: Multi-View 3D Object Detection Network for Autonomous Driving. CVPR 2017.
284 StereoDistill 78.59 % 89.03 % 69.34 % 0.4 s 1 core @ 2.5 Ghz (Python)
285 Anonymous 77.40 % 90.76 % 70.00 % 0.5 s 1 core @ 2.5 Ghz (C/C++)
286 MMLAB LIGA-Stereo
This method uses stereo information.
code 76.78 % 88.15 % 67.40 % 0.4 s 1 core @ 2.5 Ghz (Python + C/C++)
X. Guo, S. Shi, X. Wang and H. Li: LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.
287 RCD 75.83 % 82.26 % 69.61 % 0.1 s GPU @ 2.5 Ghz (Python)
A. Bewley, P. Sun, T. Mensink, D. Anguelov and C. Sminchisescu: Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection. Conference on Robot Learning (CoRL) 2020.
288 LaserNet 74.52 % 79.19 % 68.45 % 12 ms GPU @ 2.5 Ghz (C/C++)
G. Meyer, A. Laddha, E. Kee, C. Vallespi-Gonzalez and C. Wellington: LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.
289 PL++ (SDN+GDC)
This method uses stereo information.
This method makes use of Velodyne laser scans.
code 73.80 % 84.61 % 65.59 % 0.6 s GPU @ 2.5 Ghz (C/C++)
Y. You, Y. Wang, W. Chao, D. Garg, G. Pleiss, B. Hariharan, M. Campbell and K. Weinberger: Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving. International Conference on Learning Representations 2020.
290 SNVC
This method uses stereo information.
code 73.61 % 86.88 % 64.49 % 1 s GPU @ 1.0 Ghz (Python)
S. Li, Z. Liu, Z. Shen and K. Cheng: Stereo Neural Vernier Caliper. Proceedings of the AAAI Conference on Artificial Intelligence 2022.
291 A3DODWTDA
This method makes use of Velodyne laser scans.
code 73.26 % 79.58 % 62.77 % 0.08 s GPU @ 3.0 Ghz (Python)
F. Gustafsson and E. Linder-Norén: Automotive 3D Object Detection Without Target Domain Annotations. 2018.
292 Anonymous 71.23 % 86.67 % 64.08 % 0.5 s 1 core @ 2.5 Ghz (C/C++)
293 Complexer-YOLO
This method makes use of Velodyne laser scans.
68.96 % 77.24 % 64.95 % 0.06 s GPU @ 3.5 Ghz (C/C++)
M. Simon, K. Amende, A. Kraus, J. Honer, T. Samann, H. Kaulbersch, S. Milz and H. Michael Gross: Complexer-YOLO: Real-Time 3D Object Detection and Tracking on Semantic Point Clouds. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2019.
294 Pseudo-Stereo++ 68.36 % 84.64 % 59.01 % 0.4 s 1 core @ 2.5 Ghz (Python + C/C++)
295 TopNet-Retina
This method makes use of Velodyne laser scans.
68.16 % 80.16 % 63.43 % 52ms GeForce 1080Ti (tensorflow-gpu, v1.12)
S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
296 CG-Stereo
This method uses stereo information.
66.44 % 85.29 % 58.95 % 0.57 s GeForce RTX 2080 Ti
C. Li, J. Ku and S. Waslander: Confidence Guided Stereo 3D Object Detection with Split Depth Estimation. IROS 2020.
297 PLUME
This method uses stereo information.
66.27 % 82.97 % 56.70 % 0.15 s GPU @ 2.5 Ghz (Python)
Y. Wang, B. Yang, R. Hu, M. Liang and R. Urtasun: PLUME: Efficient 3D Object Detection from Stereo Images. IROS 2021.
298 CDN
This method uses stereo information.
code 66.24 % 83.32 % 57.65 % 0.6 s GPU @ 2.5 Ghz (Python)
D. Garg, Y. Wang, B. Hariharan, M. Campbell, K. Weinberger and W. Chao: Wasserstein Distances for Stereo Disparity Estimation. Advances in Neural Information Processing Systems (NeurIPS) 2020.
299 PS 65.33 % 83.75 % 56.14 % 0.4 s 1 core @ 2.5 Ghz (Python + C/C++)
300 DSGN
This method uses stereo information.
code 65.05 % 82.90 % 56.60 % 0.67 s NVIDIA Tesla V100
Y. Chen, S. Liu, X. Shen and J. Jia: DSGN: Deep Stereo Geometry Network for 3D Object Detection. CVPR 2020.
301 TopNet-DecayRate
This method makes use of Velodyne laser scans.
64.60 % 79.74 % 58.04 % 92 ms NVIDIA GeForce 1080 Ti (tensorflow-gpu)
S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
302 UPF_3D
This method uses stereo information.
63.58 % 85.53 % 56.56 % 0.29 s 1 core @ 2.5 Ghz (Python)
303 BirdNet+ (legacy)
This method makes use of Velodyne laser scans.
code 63.33 % 84.80 % 61.23 % 0.1 s Titan Xp (PyTorch)
A. Barrera, C. Guindel, J. Beltrán and F. García: BirdNet+: End-to-End 3D Object Detection in LiDAR Bird’s Eye View. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) 2020.
304 3D FCN
This method makes use of Velodyne laser scans.
61.67 % 70.62 % 55.61 % >5 s 1 core @ 2.5 Ghz (C/C++)
B. Li: 3D Fully Convolutional Network for Vehicle Detection in Point Cloud. IROS 2017.
305 CDN-PL++
This method uses stereo information.
61.04 % 81.27 % 52.84 % 0.4 s GPU @ 2.5 Ghz (C/C++)
D. Garg, Y. Wang, B. Hariharan, M. Campbell, K. Weinberger and W. Chao: Wasserstein Distances for Stereo Disparity Estimation. Advances in Neural Information Processing Systems 2020.
306 BirdNet
This method makes use of Velodyne laser scans.
59.83 % 84.17 % 57.35 % 0.11 s Titan Xp (Caffe)
J. Beltrán, C. Guindel, F. Moreno, D. Cruzado, F. García and A. Escalera: BirdNet: A 3D Object Detection Framework from LiDAR Information. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
307 TopNet-UncEst
This method makes use of Velodyne laser scans.
59.67 % 72.05 % 51.67 % 0.09 s NVIDIA GeForce 1080 Ti (tensorflow-gpu)
S. Wirges, M. Braun, M. Lauer and C. Stiller: Capturing Object Detection Uncertainty in Multi-Layer Grid Maps. 2019.
308 RT3D-GMP
This method uses stereo information.
59.00 % 69.14 % 45.49 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
H. Königshof and C. Stiller: Learning-Based Shape Estimation with Grid Map Patches for Realtime 3D Object Detection for Automated Driving. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) 2020.
309 Disp R-CNN (velo)
This method uses stereo information.
code 58.62 % 79.76 % 47.73 % 0.387 s GPU @ 2.5 Ghz (Python + C/C++)
J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao: Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation. CVPR 2020.
310 ESGN
This method uses stereo information.
58.12 % 78.10 % 49.28 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
A. Gao, Y. Pang, J. Nie, Z. Shao, J. Cao, Y. Guo and X. Li: ESGN: Efficient Stereo Geometry Network for Fast 3D Object Detection. IEEE Transactions on Circuits and Systems for Video Technology 2022.
311 Pseudo-LiDAR++
This method uses stereo information.
code 58.01 % 78.31 % 51.25 % 0.4 s GPU @ 2.5 Ghz (Python)
Y. You, Y. Wang, W. Chao, D. Garg, G. Pleiss, B. Hariharan, M. Campbell and K. Weinberger: Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving. International Conference on Learning Representations 2020.
312 Disp R-CNN
This method uses stereo information.
code 57.98 % 79.61 % 47.09 % 0.387 s GPU @ 2.5 Ghz (Python + C/C++)
J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao: Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation. CVPR 2020.
313 ZoomNet
This method uses stereo information.
code 54.91 % 72.94 % 44.14 % 0.3 s 1 core @ 2.5 Ghz (C/C++)
L. Z. Xu: ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence 2020.
314 ART 54.23 % 75.05 % 48.19 % 20 ms 1 core @ 2.5 Ghz (C/C++)
315 VoxelJones code 53.96 % 66.21 % 47.66 % .18 s 1 core @ 2.5 Ghz (Python + C/C++)
M. Motro and J. Ghosh: Vehicular Multi-object Tracking with Persistent Detector Failures. arXiv preprint arXiv:1907.11306 2019.
316 TopNet-HighRes
This method makes use of Velodyne laser scans.
53.05 % 67.84 % 46.99 % 101ms NVIDIA GeForce 1080 Ti (tensorflow-gpu)
S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
317 OC Stereo
This method uses stereo information.
code 51.47 % 68.89 % 42.97 % 0.35 s 1 core @ 2.5 Ghz (Python + C/C++)
A. Pon, J. Ku, C. Li and S. Waslander: Object-Centric Stereo Matching for 3D Object Detection. ICRA 2020.
318 YOLOStereo3D
This method uses stereo information.
code 50.28 % 76.10 % 36.86 % 0.1 s GPU 1080Ti
Y. Liu, L. Wang and M. Liu: YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection. 2021 International Conference on Robotics and Automation (ICRA) 2021.
319 RT3DStereo
This method uses stereo information.
46.82 % 58.81 % 38.38 % 0.08 s GPU @ 2.5 Ghz (C/C++)
H. Königshof, N. Salscheider and C. Stiller: Realtime 3D Object Detection for Automated Driving Using Stereo Vision and Semantic Information. Proc. IEEE Intl. Conf. Intelligent Transportation Systems 2019.
320 Pseudo-Lidar
This method uses stereo information.
code 45.00 % 67.30 % 38.40 % 0.4 s GPU @ 2.5 Ghz (Python + C/C++)
Y. Wang, W. Chao, D. Garg, B. Hariharan, M. Campbell and K. Weinberger: Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.
321 RT3D
This method makes use of Velodyne laser scans.
44.00 % 56.44 % 42.34 % 0.09 s GPU @ 1.8Ghz
Y. Zeng, Y. Hu, S. Liu, J. Ye, Y. Han, X. Li and N. Sun: RT3D: Real-Time 3-D Vehicle Detection in LiDAR Point Cloud for Autonomous Driving. IEEE Robotics and Automation Letters 2018.
322 Stereo CenterNet
This method uses stereo information.
42.12 % 62.97 % 35.37 % 0.04 s GPU @ 2.5 Ghz (Python)
Y. Shi, Y. Guo, Z. Mi and X. Li: Stereo CenterNet-based 3D object detection for autonomous driving. Neurocomputing 2022.
323 SparseLiDAR_fusion 41.51 % 54.10 % 34.14 % 0.08 s 1 core @ 2.5 Ghz (C/C++)
324 Stereo R-CNN
This method uses stereo information.
code 41.31 % 61.92 % 33.42 % 0.3 s GPU @ 2.5 Ghz (Python)
P. Li, X. Chen and S. Shen: Stereo R-CNN based 3D Object Detection for Autonomous Driving. CVPR 2019.
325 GCDR 37.34 % 50.85 % 30.51 % 0.28 s 1 core @ 2.5 Ghz (Python)
326 CIE + DM3D 33.13 % 46.17 % 28.80 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
327 StereoFENet
This method uses stereo information.
32.96 % 49.29 % 25.90 % 0.15 s 1 core @ 3.5 Ghz (Python)
W. Bao, B. Xu and Z. Chen: MonoFENet: Monocular 3D Object Detection with Feature Enhancement Networks. IEEE Transactions on Image Processing 2019.
328 Anonymous 30.81 % 43.11 % 26.81 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
329 Mobile Stereo R-CNN
This method uses stereo information.
28.78 % 44.51 % 22.30 % 1.8 s NVIDIA Jetson TX2
M. Hussein, M. Khalil and B. Abdullah: 3D Object Detection using Mobile Stereo R-CNN on Nvidia Jetson TX2. International Conference on Advanced Engineering, Technology and Applications (ICAETA) 2021.
330 CIE 28.50 % 41.41 % 23.88 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
331 Anonymous 27.70 % 37.81 % 24.61 % 40 s 1 core @ 2.5 Ghz (C/C++)
332 SARM3D 26.81 % 34.17 % 23.68 % 0.03 s GPU @ 2.5 Ghz (Python)
333 MDS-Mono3D 26.33 % 41.07 % 21.22 % 0.12 s 1 core @ 2.5 Ghz (C/C++)
334 SSAL-Mono 26.17 % 33.15 % 23.81 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
335 CMKD code 25.82 % 38.98 % 22.80 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
Y. Hong, H. Dai and Y. Ding: Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection. ECCV 2022.
336 MoGDE 25.60 % 38.38 % 22.91 % 0.03 s GPU @ 2.5 Ghz (Python)
337 MonoASS 25.60 % 35.41 % 22.14 % 0.04 s 1 core @ 2.5 Ghz (Python)
338 AMNet 25.40 % 34.68 % 22.85 % 0.03 s GPU @ 1.0 Ghz (Python)
339 MonoXiver 25.37 % 34.14 % 22.20 % 0.03s GPU @ 2.5 Ghz (Python)
340 BSM3D 25.23 % 34.82 % 22.37 % 0.03 s 1 core @ 2.5 Ghz (Python)
341 LPCG-Monoflex code 24.81 % 35.96 % 21.86 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
L. Peng, F. Liu, Z. Yu, S. Yan, D. Deng, Z. Yang, H. Liu and D. Cai: Lidar Point Cloud Guided Monocular 3D Object Detection. ECCV 2022.
342 Anonymous 24.78 % 33.38 % 22.00 % 40 s 1 core @ 2.5 Ghz (C/C++)
343 DD3Dv2 code 24.67 % 35.70 % 21.73 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
344 MonoATT code 24.42 % 36.87 % 21.88 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
345 NeurOCS 24.41 % 37.38 % 20.95 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
346 Anonymous 23.82 % 34.35 % 20.80 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
347 PS-fld code 23.76 % 32.64 % 20.64 % 0.25 s 1 core @ 2.5 Ghz (C/C++)
Y. Chen, H. Dai and Y. Ding: Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.
348 TempM3D 23.71 % 33.86 % 20.31 % 0.07 s 1 core @ 2.5 Ghz (Python)
349 ADD code 23.58 % 35.20 % 20.08 % 0.1 s 1 core @ 2.5 Ghz (Python)
350 MonoNeRD 23.46 % 31.13 % 20.97 % na s 1 core @ 2.5 Ghz (C/C++)
351 MonoDDE 23.46 % 33.58 % 20.37 % 0.04 s 1 core @ 2.5 Ghz (Python)
Z. Li, Z. Qu, Y. Zhou, J. Liu, H. Wang and L. Jiang: Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection. CVPR 2022.
352 MonoA^2 23.45 % 32.35 % 20.71 % na s 1 core @ 2.5 Ghz (C/C++)
353 BAIR 23.45 % 35.22 % 19.03 % 0.03 s 1 core @ 2.5 Ghz (Python)
354 DD3D code 23.41 % 32.35 % 20.42 % n/a s 1 core @ 2.5 Ghz (C/C++)
D. Park, R. Ambrus, V. Guizilini, J. Li and A. Gaidon: Is Pseudo-Lidar needed for Monocular 3D Object detection?. IEEE/CVF International Conference on Computer Vision (ICCV).
355 MonoA^2(new) 23.14 % 31.71 % 20.45 % na s 1 core @ 2.5 Ghz (C/C++)
356 SAD 22.81 % 34.34 % 19.44 % 0.05 s 1 core @ 2.5 Ghz (python)
357 DID-M3D code 22.76 % 32.95 % 19.83 % 0.04 s 1 core @ 2.5 Ghz (Python)
L. Peng, X. Wu, Z. Yang, H. Liu and D. Cai: DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection. ECCV 2022.
358 MonoAD 22.70 % 33.33 % 20.48 % 0.03 s GPU @ 2.5 Ghz (Python)
359 zongmuDistill 22.56 % 33.48 % 19.88 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
360 OPA-3D code 22.53 % 33.54 % 19.22 % 0.04 s 1 core @ 3.5 Ghz (Python)
361 Shape-Aware 22.13 % 32.55 % 18.94 % 0.05 s 1 core @ 2.5 Ghz (Python)
362 Anonymous 22.05 % 31.75 % 19.44 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
363 3DSeMoDLE code 21.78 % 30.99 % 18.64 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
364 Anonymous 21.74 % 32.44 % 18.38 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
365 MDNet 21.71 % 33.31 % 18.49 % 0.2 s 1 core @ 2.5 Ghz (C/C++)
366 MonoPPM code 21.66 % 30.54 % 18.64 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
367 SAD 21.56 % 33.90 % 19.08 % 0.05 s 1 core @ 2.5 Ghz (python)
368 Lite-FPN-GUPNet 21.53 % 31.68 % 18.38 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
369 DCD code 21.50 % 32.55 % 18.25 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
Y. Li, Y. Chen, J. He and Z. Zhang: Densely Constrained Depth Estimator for Monocular 3D Object Detection. European Conference on Computer Vision 2022.
370 MonoDETR code 21.45 % 32.20 % 18.68 % 0.04 s 1 core @ 2.5 Ghz (Python)
R. Zhang, H. Qiu, T. Wang, X. Xu, Z. Guo, Y. Qiao, P. Gao and H. Li: MonoDETR: Depth-aware Transformer for Monocular 3D Object Detection. arXiv preprint arXiv:2203.13310 2022.
371 OBMO_GUPNet 21.41 % 30.81 % 18.37 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
372 M3DGAF 21.39 % 31.34 % 19.28 % 0.07 s 1 core @ 2.5 Ghz (Python)
373 SGM3D code 21.37 % 31.49 % 18.43 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
Z. Zhou, L. Du, X. Ye, Z. Zou, X. Tan, L. Zhang, X. Xue and J. Feng: SGM3D: Stereo Guided Monocular 3D Object Detection. RA-L 2022.
374 monopd code 21.29 % 32.12 % 18.08 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
375 DEPT 21.22 % 30.85 % 18.47 % 0.03 s 1 core @ 2.5 Ghz (Python)
376 Mono3DMethod 21.21 % 32.57 % 18.07 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
377 GUPNet code 21.19 % 30.29 % 18.20 % NA s 1 core @ 2.5 Ghz (Python + C/C++)
Y. Lu, X. Ma, L. Yang, T. Zhang, Y. Liu, Q. Chu, J. Yan and W. Ouyang: Geometry Uncertainty Projection Network for Monocular 3D Object Detection. arXiv preprint arXiv:2107.13774 2021.
378 MonoInsight 21.06 % 29.65 % 18.22 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
379 MM3D 20.93 % 31.44 % 18.72 % NA s 1 core @ 2.5 Ghz (C/C++)
380 HBD 20.91 % 29.87 % 18.22 % 0.12 s 1 core @ 2.5 Ghz (C/C++)
381 GPENet code 20.79 % 30.31 % 18.21 % 0.02 s GPU @ 2.5 Ghz (Python)
382 BCA 20.75 % 30.03 % 17.60 % 0.17 s GPU @ 2.5 Ghz (Python)
383 LT-M3OD 20.74 % 29.40 % 17.83 % 0.03 s 1 core @ 2.5 Ghz (Python)
384 HomoLoss(monoflex) code 20.68 % 29.60 % 17.81 % 0.04 s 1 core @ 2.5 Ghz (Python)
J. Gu, B. Wu, L. Fan, J. Huang, S. Cao, Z. Xiang and X. Hua: Homography Loss for Monocular 3D Object Detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.
385 Anonymous 20.47 % 33.17 % 17.31 % 40 s 1 core @ 2.5 Ghz (C/C++)
386 DEVIANT code 20.44 % 29.65 % 17.43 % 0.04 s 1 GPU (Python)
A. Kumar, G. Brazil, E. Corona, A. Parchami and X. Liu: DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection. European Conference on Computer Vision (ECCV) 2022.
387 EW code 20.38 % 28.88 % 17.59 % 0.05 s 1 core @ 2.5 Ghz (C/C++)
388 MonoDTR 20.38 % 28.59 % 17.14 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
K. Huang, T. Wu, H. Su and W. Hsu: MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer. CVPR 2022.
389 MDSNet 20.14 % 32.81 % 15.77 % 0.05 s 1 core @ 2.5 Ghz (Python)
Z. Xie, Y. Song, J. Wu, Z. Li, C. Song and Z. Xu: MDS-Net: Multi-Scale Depth Stratification 3D Object Detection from Monocular Images. Sensors 2022.
390 AutoShape code 20.08 % 30.66 % 15.95 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
Z. Liu, D. Zhou, F. Lu, J. Fang and L. Zhang: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection. Proceedings of the IEEE/CVF International Conference on Computer Vision 2021.
391 MonoEdge-RCNN 20.07 % 27.62 % 16.34 % 0.05 s 1 core @ 2.5 Ghz (Python)
392 MonoPCNS 19.89 % 28.27 % 17.96 % 0.14 s GPU @ 2.5 Ghz (Python)
393 MonoFlex 19.75 % 28.23 % 16.89 % 0.03 s GPU @ 2.5 Ghz (Python)
Y. Zhang, J. Lu and J. Zhou: Objects are Different: Flexible Monocular 3D Object Detection. CVPR 2021.
394 MonoEF 19.70 % 29.03 % 17.26 % 0.03 s 1 core @ 2.5 Ghz (Python)
Y. Zhou, Y. He, H. Zhu, C. Wang, H. Li and Q. Jiang: Monocular 3D Object Detection: An Extrinsic Parameter Free Approach. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021.
395 HomoLoss(imvoxelnet) code 19.25 % 29.18 % 16.21 % 0.20 s 1 core @ 2.5 Ghz (Python)
J. Gu, B. Wu, L. Fan, J. Huang, S. Cao, Z. Xiang and X. Hua: Homography Loss for Monocular 3D Object Detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.
396 MonoAug 19.19 % 28.20 % 16.15 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
397 MK3D 19.18 % 29.11 % 15.78 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
398 DFR-Net 19.17 % 28.17 % 14.84 % 0.18 s 1080 Ti (Pytorch)
Z. Zou, X. Ye, L. Du, X. Cheng, X. Tan, L. Zhang, J. Feng, X. Xue and E. Ding: The devil is in the task: Exploiting reciprocal appearance-localization features for monocular 3d object detection. ICCV 2021.
399 DLE code 19.05 % 31.09 % 14.13 % 0.06 s NVIDIA Tesla V100
C. Liu, S. Gu, L. Gool and R. Timofte: Deep Line Encoding for Monocular 3D Object Detection and Depth Prediction. Proceedings of the British Machine Vision Conference (BMVC) 2021.
400 PCT code 19.03 % 29.65 % 15.92 % 0.045 s 1 core @ 2.5 Ghz (C/C++)
L. Wang, L. Zhang, Y. Zhu, Z. Zhang, T. He, M. Li and X. Xue: Progressive Coordinate Transforms for Monocular 3D Object Detection. NeurIPS 2021.
401 CaDDN code 18.91 % 27.94 % 17.19 % 0.63 s GPU @ 2.5 Ghz (Python)
C. Reading, A. Harakeh, J. Chae and S. Waslander: Categorical Depth Distribution Network for Monocular 3D Object Detection. CVPR 2021.
402 monodle code 18.89 % 24.79 % 16.00 % 0.04 s GPU @ 2.5 Ghz (Python)
X. Ma, Y. Zhang, D. Xu, D. Zhou, S. Yi, H. Li and W. Ouyang: Delving into Localization Errors for Monocular 3D Object Detection. CVPR 2021.
403 Neighbor-Vote 18.65 % 27.39 % 16.54 % 0.1 s GPU @ 2.5 Ghz (Python)
X. Chu, J. Deng, Y. Li, Z. Yuan, Y. Zhang, J. Ji and Y. Zhang: Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting. ACM MM 2021.
404 MonoRCNN++ code 18.62 % 27.20 % 15.69 % 0.07 s GPU @ 2.5 Ghz (Python)
X. Shi, Z. Chen and T. Kim: Multivariate Probabilistic Monocular 3D Object Detection. WACV 2023.
405 GrooMeD-NMS code 18.27 % 26.19 % 14.05 % 0.12 s 1 core @ 2.5 Ghz (Python)
A. Kumar, G. Brazil and X. Liu: GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection. CVPR 2021.
406 MonoRCNN code 18.11 % 25.48 % 14.10 % 0.07 s GPU @ 2.5 Ghz (Python)
X. Shi, Q. Ye, X. Chen, C. Chen, Z. Chen and T. Kim: Geometry-based Distance Decomposition for Monocular 3D Object Detection. ICCV 2021.
407 Ground-Aware code 17.98 % 29.81 % 13.08 % 0.05 s 1 core @ 2.5 Ghz (C/C++)
Y. Liu, Y. Yuan and M. Liu: Ground-aware Monocular 3D Object Detection for Autonomous Driving. IEEE Robotics and Automation Letters 2021.
408 Aug3D-RPN 17.89 % 26.00 % 14.18 % 0.08 s 1 core @ 2.5 Ghz (C/C++)
C. He, J. Huang, X. Hua and L. Zhang: Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth. 2021.
409 DDMP-3D 17.89 % 28.08 % 13.44 % 0.18 s 1 core @ 2.5 Ghz (Python)
L. Wang, L. Du, X. Ye, Y. Fu, G. Guo, X. Xue, J. Feng and L. Zhang: Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection. CVPR 2021.
410 IAFA 17.88 % 25.88 % 15.35 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
D. Zhou, X. Song, Y. Dai, J. Yin, F. Lu, M. Liao, J. Fang and L. Zhang: IAFA: Instance-Aware Feature Aggregation for 3D Object Detection from a Single Image. Proceedings of the Asian Conference on Computer Vision 2020.
411 FMF-occlusion-net 17.60 % 27.39 % 13.25 % 0.16 s 1 core @ 2.5 Ghz (Python + C/C++)
H. Liu, H. Liu, Y. Wang, F. Sun and W. Huang: Fine-grained Multi-level Fusion for Anti-occlusion Monocular 3D Object Detection. IEEE Transactions on Image Processing 2022.
412 RefinedMPL 17.60 % 28.08 % 13.95 % 0.15 s GPU @ 2.5 Ghz (Python + C/C++)
J. Vianney, S. Aich and B. Liu: RefinedMPL: Refined Monocular PseudoLiDAR for 3D Object Detection in Autonomous Driving. arXiv preprint arXiv:1911.09712 2019.
413 Kinematic3D code 17.52 % 26.69 % 13.10 % 0.12 s 1 core @ 1.5 Ghz (C/C++)
G. Brazil, G. Pons-Moll, X. Liu and B. Schiele: Kinematic 3D Object Detection in Monocular Video. ECCV 2020.
414 MonoRUn code 17.34 % 27.94 % 15.24 % 0.07 s GPU @ 2.5 Ghz (Python + C/C++)
H. Chen, Y. Huang, W. Tian, Z. Gao and L. Xiong: MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2021.
415 AM3D 17.32 % 25.03 % 14.91 % 0.4 s GPU @ 2.5 Ghz (Python + C/C++)
X. Ma, Z. Wang, H. Li, P. Zhang, W. Ouyang and X. Fan: Accurate Monocular Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving. Proceedings of the IEEE International Conference on Computer Vision (ICCV) 2019.
416 YoloMono3D code 17.15 % 26.79 % 12.56 % 0.05 s GPU @ 2.5 Ghz (Python)
Y. Liu, L. Wang and L. Ming: YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection. 2021 International Conference on Robotics and Automation (ICRA) 2021.
417 CMAN 17.04 % 25.89 % 12.88 % 0.15 s 1 core @ 2.5 Ghz (Python)
C. Yuanzhouhan Cao: CMAN: Leaning Global Structure Correlation for Monocular 3D Object Detection. IEEE Trans. Intell. Transport. Syst. 2022.
418 GAC3D 16.93 % 25.80 % 12.50 % 0.25 s 1 core @ 2.5 Ghz (Python)
M. Bui, D. Ngo, H. Pham and D. Nguyen: GAC3D: improving monocular 3D object detection with ground-guide model and adaptive convolution. 2021.
419 PatchNet code 16.86 % 22.97 % 14.97 % 0.4 s 1 core @ 2.5 Ghz (C/C++)
X. Ma, S. Liu, Z. Xia, H. Zhang, X. Zeng and W. Ouyang: Rethinking Pseudo-LiDAR Representation. Proceedings of the European Conference on Computer Vision (ECCV) 2020.
420 MonoAug 16.71 % 24.39 % 13.83 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
421 PGD-FCOS3D code 16.51 % 26.89 % 13.49 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
T. Wang, X. Zhu, J. Pang and D. Lin: Probabilistic and Geometric Depth: Detecting Objects in Perspective. Conference on Robot Learning (CoRL) 2021.
422 MDT code 16.47 % 24.22 % 13.42 % 0.01 s 1 core @ 2.5 Ghz (Python)
423 ImVoxelNet code 16.37 % 25.19 % 13.58 % 0.2 s GPU @ 2.5 Ghz (Python)
D. Rukhovich, A. Vorontsova and A. Konushin: ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection. arXiv preprint arXiv:2106.01178 2021.
424 KM3D code 16.20 % 23.44 % 14.47 % 0.03 s 1 core @ 2.5 Ghz (Python)
P. Li: Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training. 2020.
425 D4LCN code 16.02 % 22.51 % 12.55 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
M. Ding, Y. Huo, H. Yi, Z. Wang, J. Shi, Z. Lu and P. Luo: Learning Depth-Guided Convolutions for Monocular 3D Object Detection. CVPR 2020.
426 MonoPair 14.83 % 19.28 % 12.89 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
Y. Chen, L. Tai, K. Sun and M. Li: MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020.
427 Decoupled-3D 14.82 % 23.16 % 11.25 % 0.08 s GPU @ 2.5 Ghz (C/C++)
Y. Cai, B. Li, Z. Jiao, H. Li, X. Zeng and X. Wang: Monocular 3D Object Detection with Decoupled Structured Polygon Estimation and Height-Guided Depth Estimation. AAAI 2020.
428 QD-3DT
This is an online method (no batch processing).
code 14.71 % 20.16 % 12.76 % 0.03 s GPU @ 2.5 Ghz (Python)
H. Hu, Y. Yang, T. Fischer, F. Yu, T. Darrell and M. Sun: Monocular Quasi-Dense 3D Object Tracking. ArXiv:2103.07351 2021.
429 SMOKE code 14.49 % 20.83 % 12.75 % 0.03 s GPU @ 2.5 Ghz (Python)
Z. Liu, Z. Wu and R. Tóth: SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation. 2020.
430 RTM3D code 14.20 % 19.17 % 11.99 % 0.05 s GPU @ 1.0 Ghz (Python)
P. Li, H. Zhao, P. Liu and F. Cao: RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving. 2020.
431 Mono3D_PLiDAR code 13.92 % 21.27 % 11.25 % 0.1 s NVIDIA GeForce 1080 (pytorch)
X. Weng and K. Kitani: Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud. arXiv:1903.09847 2019.
432 M3D-RPN code 13.67 % 21.02 % 10.23 % 0.16 s GPU @ 1.5 Ghz (Python)
G. Brazil and X. Liu: M3D-RPN: Monocular 3D Region Proposal Network for Object Detection. ICCV 2019.
433 CSoR
This method makes use of Velodyne laser scans.
13.07 % 18.67 % 10.34 % 3.5 s 4 cores @ >3.5 Ghz (Python + C/C++)
L. Plotkin: PyDriver: Entwicklung eines Frameworks für räumliche Detektion und Klassifikation von Objekten in Fahrzeugumgebung. 2015.
434 MonoPSR code 12.58 % 18.33 % 9.91 % 0.2 s GPU @ 3.5 Ghz (Python)
J. Ku*, A. Pon* and S. Waslander: Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction. CVPR 2019.
435 MonoCInIS 11.64 % 22.28 % 9.95 % 0.13 s GPU @ 2.5 Ghz (C/C++)
J. Heylen, M. De Wolf, B. Dawagne, M. Proesmans, L. Van Gool, W. Abbeloos, H. Abdelkawy and D. Reino: MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation. Proceedings of the IEEE/CVF International Conference on Computer Vision 2021.
436 SS3D 11.52 % 16.33 % 9.93 % 48 ms Tesla V100 (Python)
E. Jörgensen, C. Zach and F. Kahl: Monocular 3D Object Detection and Box Fitting Trained End-to-End Using Intersection-over-Union Loss. CoRR 2019.
437 MonoGRNet code 11.17 % 18.19 % 8.73 % 0.04 s NVIDIA P40
Z. Qin, J. Wang and Y. Lu: MonoGRNet: A Geometric Reasoning Network for 3D Object Localization. The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19) 2019.
438 MonoFENet 11.03 % 17.03 % 9.05 % 0.15 s 1 core @ 3.5 Ghz (Python)
W. Bao, B. Xu and Z. Chen: MonoFENet: Monocular 3D Object Detection with Feature Enhancement Networks. IEEE Transactions on Image Processing 2019.
439 MonoCInIS 10.96 % 20.42 % 9.23 % 0.14 s GPU @ 2.5 Ghz (Python)
J. Heylen, M. De Wolf, B. Dawagne, M. Proesmans, L. Van Gool, W. Abbeloos, H. Abdelkawy and D. Reino: MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation. Proceedings of the IEEE/CVF International Conference on Computer Vision 2021.
440 A3DODWTDA (image) code 8.66 % 10.37 % 7.06 % 0.8 s GPU @ 3.0 Ghz (Python)
F. Gustafsson and E. Linder-Norén: Automotive 3D Object Detection Without Target Domain Annotations. 2018.
441 TLNet (Stereo)
This method uses stereo information.
code 7.69 % 13.71 % 6.73 % 0.1 s 1 core @ 2.5 Ghz (Python)
Z. Qin, J. Wang and Y. Lu: Triangulation Learning Network: from Monocular to Stereo 3D Object Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.
442 Shift R-CNN (mono) code 6.82 % 11.84 % 5.27 % 0.25 s GPU @ 1.5 Ghz (Python)
A. Naiden, V. Paunescu, G. Kim, B. Jeon and M. Leordeanu: Shift R-CNN: Deep Monocular 3D Object Detection With Closed-form Geometric Constraints. ICIP 2019.
443 SparVox3D 6.39 % 10.20 % 5.06 % 0.05 s GPU @ 2.0 Ghz (Python)
E. Balatkan and F. Kıraç: Improving Regression Performance on Monocular 3D Object Detection Using Bin-Mixing and Sparse Voxel Data. 2021 6th International Conference on Computer Science and Engineering (UBMK) 2021.
444 GS3D 6.08 % 8.41 % 4.94 % 2 s 1 core @ 2.5 Ghz (C/C++)
B. Li, W. Ouyang, L. Sheng, X. Zeng and X. Wang: GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.
445 MVRA + I-FRCNN+ 5.84 % 9.05 % 4.50 % 0.18 s GPU @ 2.5 Ghz (Python)
H. Choi, H. Kang and Y. Hyun: Multi-View Reprojection Architecture for Orientation Estimation. The IEEE International Conference on Computer Vision (ICCV) Workshops 2019.
446 WeakM3D code 5.66 % 11.82 % 4.08 % 0.08 s 1 core @ 2.5 Ghz (C/C++)
L. Peng, S. Yan, B. Wu, Z. Yang, X. He and D. Cai: WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection. ICLR 2022.
447 ROI-10D 4.91 % 9.78 % 3.74 % 0.2 s GPU @ 3.5 Ghz (Python)
F. Manhardt, W. Kehl and A. Gaidon: ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape. Computer Vision and Pattern Recognition (CVPR) 2019.
448 CDTrack3D code 4.61 % 7.02 % 3.73 % 0.0106 s NVIDIA RTX 3090 GPU, i9 10850k CPU
449 3D-GCK 4.57 % 5.79 % 3.64 % 24 ms Tesla V100
N. Gählert, J. Wan, N. Jourdan, J. Finkbeiner, U. Franke and J. Denzler: Single-Shot 3D Detection of Vehicles from Monocular RGB Images via Geometrically Constrained Keypoints in Real-Time. 2020 IEEE Intelligent Vehicles Symposium (IV) 2020.
450 FQNet 3.23 % 5.40 % 2.46 % 0.5 s 1 core @ 2.5 Ghz (Python)
L. Liu, J. Lu, C. Xu, Q. Tian and J. Zhou: Deep Fitting Degree Scoring Network for Monocular 3D Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019.
451 3D-SSMFCNN code 2.63 % 3.20 % 2.40 % 0.1 s GPU @ 1.5 Ghz (C/C++)
L. Novak: Vehicle Detection and Pose Estimation for Autonomous Driving. 2017.
452 VeloFCN
This method makes use of Velodyne laser scans.
0.14 % 0.02 % 0.21 % 1 s GPU @ 2.5 Ghz (Python + C/C++)
B. Li, T. Zhang and T. Xia: Vehicle Detection from 3D Lidar Using Fully Convolutional Network. RSS 2016.
453 MonoDET code 0.14 % 0.25 % 0.10 % 0.04 s 1 core @ 2.5 Ghz (Python)
454 test code 0.09 % 0.04 % 0.11 % 50 s 1 core @ 2.5 Ghz (Python)
455 Yolo5x6_Ghost 0.00 % 0.00 % 0.00 % 0.03 s GPU @ 2.5 Ghz (Python)
456 Yolo5x6_Ghost 0.00 % 0.00 % 0.00 % 0.03 s GPU @ 2.5 Ghz (Python)
457 multi-task CNN 0.00 % 0.00 % 0.00 % 25.1 ms GPU @ 2.0 Ghz (Python)
M. Oeljeklaus, F. Hoffmann and T. Bertram: A Fast Multi-Task CNN for Spatial Understanding of Traffic Scenes. IEEE Intelligent Transportation Systems Conference 2018.
458 Ghost3D object detec 0.00 % 0.00 % 0.00 % 0.03 s 1 core @ 2.5 Ghz (Python)
459 aliii0 0.00 % 0.00 % 0.00 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
460 Res 0.00 % 0.00 % 0.00 % 0.03 s 1 core @ 2.5 Ghz (Python)
461 GHos_3d 0.00 % 0.00 % 0.00 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
462 ALI_TRY1 0.00 % 0.00 % 0.00 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
463 BEV_GHOST 0.00 % 0.00 % 0.00 % 0.1 s 1 core @ 2.5 Ghz (Python)
464 mBoW
This method makes use of Velodyne laser scans.
0.00 % 0.00 % 0.00 % 10 s 1 core @ 2.5 Ghz (C/C++)
J. Behley, V. Steinhage and A. Cremers: Laser-based Segment Classification Using a Mixture of Bag-of-Words. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2013.

Pedestrian


Method Setting Code Moderate Easy Hard Runtime Environment
1 PiFeNet code 53.92 % 63.25 % 50.53 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
D. Le, H. Shi, H. Rezatofighi and J. Cai: Accurate and Real-time 3D Pedestrian Detection Using an Efficient Attentive Pillar Network. arXiv preprint arXiv:2112.15458 2022.
2 CasA++ code 53.84 % 60.14 % 51.35 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
H. Wu, J. Deng, C. Wen, X. Li and C. Wang: CasA: A Cascade Attention Network for 3D Object Detection from LiDAR point clouds. IEEE Transactions on Geoscience and Remote Sensing 2022.
3 TED 53.48 % 60.13 % 50.89 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
H. Wu, C. Wen, W. Li, R. Yang and C. Wang: Transformation-Equivariant 3D Object Detection for Autonomous Driving. AAAI 2023.
4 BiProDet 53.32 % 58.91 % 50.82 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
5 DCAN-Second code 53.18 % 60.92 % 50.56 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
6 EQ-PVRCNN code 52.81 % 61.73 % 49.87 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Yang, L. Jiang, Y. Sun, B. Schiele and J. Jia: A Unified Query-based Paradigm for Point Cloud Understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2022.
7 VPFNet code 52.41 % 60.07 % 50.28 % 0.2 s 1 core @ 2.5 Ghz (C/C++)
C. Wang, H. Chen and L. Fu: VPFNet: Voxel-Pixel Fusion Network for Multi-class 3D Object Detection. 2021.
8 Frustum-PointPillars code 52.23 % 60.98 % 48.30 % 0.06 s 4 cores @ 3.0 Ghz (Python)
A. Paigwar, D. Sierra-Gonzalez, Ö. Erkent and C. Laugier: Frustum-PointPillars: A Multi-Stage Approach for 3D Object Detection using RGB Camera and LiDAR. International Conference on Computer Vision, ICCV, Workshop on Autonomous Vehicle Vision 2021.
9 CAD 52.20 % 60.23 % 49.54 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
10 LoGoNet 52.06 % 58.24 % 49.87 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
11 TANet code 51.38 % 60.85 % 47.54 % 0.035 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Liu, X. Zhao, T. Huang, R. Hu, Y. Zhou and X. Bai: TANet: Robust 3D Object Detection from Point Clouds with Triple Attention. AAAI 2020.
12 CasA code 51.37 % 57.95 % 49.08 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
H. Wu, J. Deng, C. Wen, X. Li and C. Wang: CasA: A Cascade Attention Network for 3D Object Detection from LiDAR point clouds. IEEE Transactions on Geoscience and Remote Sensing 2022.
13 MMLab PV-RCNN
This method makes use of Velodyne laser scans.
code 50.57 % 59.86 % 46.74 % 0.08 s 1 core @ 2.5 Ghz (Python + C/C++)
S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang and H. Li: PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection. CVPR 2020.
14 HotSpotNet 50.53 % 57.39 % 46.65 % 0.04 s 1 core @ 2.5 Ghz (Python + C/C++)
Q. Chen, L. Sun, Z. Wang, K. Jia and A. Yuille: object as hotspots. Proceedings of the European Conference on Computer Vision (ECCV) 2020.
15 VMVS
This method makes use of Velodyne laser scans.
50.34 % 60.34 % 46.45 % 0.25 s GPU @ 2.5 Ghz (Python)
J. Ku, A. Pon, S. Walsh and S. Waslander: Improving 3D object detection for pedestrians with virtual multi-view synthesis orientation estimation. IROS 2019.
16 AVOD-FPN
This method makes use of Velodyne laser scans.
code 50.32 % 58.49 % 46.98 % 0.1 s Titan X (Pascal)
J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object Detection from View Aggregation. IROS 2018.
17 SPT 50.22 % 56.54 % 46.72 % 0.1 s GPU @ 2.5 Ghz (Python)
18 variance_point 50.03 % 57.72 % 46.27 % 0.05 s 1 core @ 2.5 Ghz (Python)
19 3DSSD code 49.94 % 60.54 % 45.73 % 0.04 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Yang, Y. Sun, S. Liu and J. Jia: 3DSSD: Point-based 3D Single Stage Object Detector. CVPR 2020.
20 PointPainting
This method makes use of Velodyne laser scans.
49.93 % 58.70 % 46.29 % 0.4 s GPU @ 2.5 Ghz (Python + C/C++)
S. Vora, A. Lang, B. Helou and O. Beijbom: PointPainting: Sequential Fusion for 3D Object Detection. CVPR 2020.
21 SemanticVoxels 49.93 % 58.91 % 47.31 % 0.04 s GPU @ 2.5 Ghz (Python + C/C++)
J. Fei, W. Chen, P. Heidenreich, S. Wirges and C. Stiller: SemanticVoxels: Sequential Fusion for 3D Pedestrian Detection using LiDAR Point Cloud and Semantic Segmentation. MFI 2020.
22 ACDet code 49.82 % 58.35 % 47.17 % 0.05 s 1 core @ 2.5 Ghz (C/C++)
J. Xu, G. Wang, X. Zhang and G. Wan: ACDet: Attentive Cross-view Fusion for LiDAR-based 3D Object Detection. 3DV 2022.
23 MMLab-PartA^2
This method makes use of Velodyne laser scans.
code 49.81 % 59.04 % 45.92 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
S. Shi, Z. Wang, J. Shi, X. Wang and H. Li: From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020.
24 F-PointNet
This method makes use of Velodyne laser scans.
code 49.57 % 57.13 % 45.48 % 0.17 s GPU @ 3.0 Ghz (Python)
C. Qi, W. Liu, C. Wu, H. Su and L. Guibas: Frustum PointNets for 3D Object Detection from RGB-D Data. arXiv preprint arXiv:1711.08488 2017.
25 CFF-tv 49.29 % 57.83 % 46.70 % 1 s 1 core @ 2.5 Ghz (C/C++)
26 F-ConvNet
This method makes use of Velodyne laser scans.
code 48.96 % 57.04 % 44.33 % 0.47 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Wang and K. Jia: Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection. IROS 2019.
27 HVNet 48.86 % 54.84 % 46.33 % 0.03 s GPU @ 2.0 Ghz (Python)
M. Ye, S. Xu and T. Cao: HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection. CVPR 2020.
28 CAT-Det 48.78 % 57.13 % 45.56 % 0.3 s GPU @ 2.5 Ghz (Python + C/C++)
Y. Zhang, J. Chen and D. Huang: CAT-Det: Contrastively Augmented Transformer for Multi-modal 3D Object Detection. CVPR 2022.
29 STD code 48.72 % 60.02 % 44.55 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Yang, Y. Sun, S. Liu, X. Shen and J. Jia: STD: Sparse-to-Dense 3D Object Detector for Point Cloud. ICCV 2019.
30 Reprod-Two-Branch 48.71 % 57.25 % 45.75 % 0.05 s 1 core @ 2.5 Ghz (C/C++)
31 KeyFuse2B 48.64 % 56.16 % 46.20 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
32 PointPillars
This method makes use of Velodyne laser scans.
code 48.64 % 57.60 % 45.78 % 16 ms 1080ti GPU and Intel i7 CPU
A. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang and O. Beijbom: PointPillars: Fast Encoders for Object Detection from Point Clouds. CVPR 2019.
33 USVLab BSAODet 48.61 % 55.76 % 46.08 % 0.04 s GPU @ 2.5 Ghz (Python + C/C++)
34 FV2P v2 48.58 % 54.90 % 45.11 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
35 EPNet++ 48.47 % 56.24 % 45.73 % 0.1 s GPU @ 2.5 Ghz (Python)
Z. Liu, T. Huang, B. Li, X. Chen, X. Wang and X. Bai: EPNet++: Cascade Bi-directional Fusion for Multi-Modal 3D Object Detection. arXiv preprint arXiv:2112.11088 2021.
36 MGAF-3DSSD code 48.46 % 56.09 % 44.90 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
J. Li, H. Dai, L. Shao and Y. Ding: Anchor-free 3D Single Stage Detector with Mask-Guided Attention for Point Cloud. MM '21: The 29th ACM International Conference on Multimedia (ACM MM) 2021.
37 CFF-ep25 48.31 % 56.34 % 45.58 % 1 s 1 core @ 2.5 Ghz (C/C++)
38 Fast-CLOCs 48.27 % 57.19 % 44.55 % 0.1 s GPU @ 2.5 Ghz (Python)
S. Pang, D. Morris and H. Radha: Fast-CLOCs: Fast Camera-LiDAR Object Candidates Fusion for 3D Object Detection. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022.
39 FromVoxelToPoint code 48.15 % 56.54 % 45.63 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
J. Li, H. Dai, L. Shao and Y. Ding: From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder. MM '21: The 29th ACM International Conference on Multimedia (ACM MM) 2021.
40 cff-tv-v2-ep25 48.13 % 56.48 % 45.66 % 1 s 1 core @ 2.5 Ghz (C/C++)
41 USVLab BSAODet (S) 48.10 % 54.96 % 45.65 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
42 HMFI code 47.77 % 55.61 % 45.17 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
X. Li, B. Shi, Y. Hou, X. Wu, T. Ma, Y. Li and L. He: Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection. ECCV 2022.
43 TBD 47.77 % 55.61 % 45.17 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
44 CFF-tv-v2 47.59 % 55.46 % 45.09 % 1 s 1 core @ 2.5 Ghz (C/C++)
45 VoCo 47.47 % 52.94 % 45.41 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
46 CF-ctdep-tv-ta 47.46 % 54.36 % 45.07 % 1 s 1 core @ 2.5 Ghz (C/C++)
47 P2V-RCNN 47.36 % 54.15 % 45.10 % 0.1 s 2 cores @ 2.5 Ghz (Python)
J. Li, S. Luo, Z. Zhu, H. Dai, A. Krylov, Y. Ding and L. Shao: P2V-RCNN: Point to Voxel Feature Learning for 3D Object Detection from Point Clouds. IEEE Access 2021.
48 CF-base-tv 47.28 % 54.77 % 44.81 % 1 s 1 core @ 2.5 Ghz (C/C++)
49 CZY_PPF_Net2 47.22 % 51.95 % 45.46 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
50 Self-Calib Conv 47.17 % 54.20 % 44.84 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
51 TCDVF 47.11 % 55.26 % 44.53 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
52 Point-GNN
This method makes use of Velodyne laser scans.
code 47.07 % 55.36 % 44.61 % 0.6 s GPU @ 2.5 Ghz (Python)
W. Shi and R. Rajkumar: Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud. CVPR 2020.
53 MVMM code 46.84 % 53.75 % 44.87 % 0.04 s GPU @ 2.5 Ghz (Python + C/C++)
54 SCNet
This method makes use of Velodyne laser scans.
46.73 % 56.87 % 42.74 % 0.04 s GPU @ 3.0 Ghz (Python)
Z. Wang, H. Fu, L. Wang, L. Xiao and B. Dai: SCNet: Subdivision Coding Network for Object Detection Based on 3D Point Cloud. IEEE Access 2019.
55 cp-tv-kp 46.71 % 53.73 % 44.45 % 1 s 1 core @ 2.5 Ghz (C/C++)
56 SGDA3D 46.66 % 52.65 % 44.62 % 0.07 s 1 core @ 2.5 Ghz (Python)
57 Anonymous
This method makes use of Velodyne laser scans.
46.65 % 52.20 % 44.61 % 0.05 s GPU @ 3.0 Ghz (Python + C/C++)
58 DGT-Det3D code 46.59 % 54.25 % 44.15 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
59 PSA-Det3D 46.36 % 53.26 % 43.73 % 0.1 s GPU @ 2.5 Ghz (Python)
60 CF-ctdep-tv 46.36 % 53.50 % 44.01 % 1 s 1 core @ 2.5 Ghz (C/C++)
61 CZY_3917 46.31 % 51.01 % 44.44 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
62 PA-RCNN code 46.30 % 53.60 % 44.33 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
63 3SNet 46.25 % 52.22 % 42.89 % 0.07 s GPU @ 2.5 Ghz (Python)
64 DGT-Det3D 46.22 % 53.98 % 43.85 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
65 DTE3D 46.18 % 53.38 % 43.52 % 0.19 s 1 core @ 2.5 Ghz (C/C++)
66 MMLab-PointRCNN
This method makes use of Velodyne laser scans.
code 46.13 % 54.77 % 42.84 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
S. Shi, X. Wang and H. Li: Pointrcnn: 3d object proposal generation and detection from point cloud. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019.
67 Anonymous 46.13 % 55.51 % 43.60 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
68 ARPNET 45.92 % 55.48 % 42.54 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
Y. Ye, C. Zhang and X. Hao: ARPNET: attention region proposal network for 3D object detection. Science China Information Sciences 2019.
69 Under Blind Review#2 45.85 % 52.35 % 44.00 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
70 CenterFuse 45.84 % 55.20 % 43.46 % 0.059 sec/frame 2 x V100
71 DSA-PV-RCNN
This method makes use of Velodyne laser scans.
code 45.82 % 52.03 % 43.81 % 0.08 s 1 core @ 2.5 Ghz (Python + C/C++)
P. Bhattacharyya, C. Huang and K. Czarnecki: SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection. 2021.
72 U_SECOND_V4 45.79 % 53.57 % 43.52 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
73 cp-tv 45.75 % 52.90 % 43.49 % 1 s 1 core @ 2.5 Ghz (C/C++)
74 SVGA-Net 45.68 % 53.09 % 43.30 % 0.03 s 1 core @ 2.5 Ghz (Python + C/C++)
Q. He, Z. Wang, H. Zeng, Y. Zeng and Y. Liu: SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds. AAAI 2022.
75 TBD 45.57 % 52.08 % 42.35 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
76 Anonymous 45.53 % 53.94 % 43.02 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
77 Anonymous 45.50 % 54.84 % 42.71 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
78 epBRM
This method makes use of Velodyne laser scans.
code 45.49 % 52.48 % 42.75 % 0.10 s 1 core @ 2.5 Ghz (C/C++)
K. Shin: Improving a Quality of 3D Object Detection by Spatial Transformation Mechanism. arXiv preprint arXiv:1910.04853 2019.
79 KPSCC code 45.46 % 52.72 % 42.53 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
80 PDV code 45.45 % 51.95 % 43.33 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
J. Hu, T. Kuai and S. Waslander: Point Density-Aware Voxels for LiDAR 3D Object Detection. CVPR 2022.
81 MLOD
This method makes use of Velodyne laser scans.
code 45.40 % 55.09 % 41.42 % 0.12 s GPU @ 1.5 Ghz (Python)
J. Deng and K. Czarnecki: MLOD: A multi-view 3D object detection based on robust feature fusion method. arXiv preprint arXiv:1909.04163 2019.
82 cp-tv-kp-io-sc 45.30 % 53.84 % 42.12 % 1 s 1 core @ 2.5 Ghz (C/C++)
83 U_PVRCNN_V2 45.23 % 51.52 % 42.55 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
84 VPNet 45.12 % 52.68 % 42.05 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
85 IA-SSD (single) code 45.07 % 52.73 % 42.75 % 0.013 s 1 core @ 2.5 Ghz (C/C++)
Y. Zhang, Q. Hu, G. Xu, Y. Ma, J. Wan and Y. Guo: Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds. CVPR 2022.
86 SRDL 44.84 % 52.42 % 42.56 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
87 PVRCNN_8369 44.83 % 52.41 % 42.57 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
88 CZY_PPF_Net 44.80 % 49.97 % 42.11 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
89 M3DeTR code 44.78 % 50.63 % 42.57 % n/a s GPU @ 1.0 Ghz (Python)
T. Guan, J. Wang, S. Lan, R. Chandra, Z. Wu, L. Davis and D. Manocha: M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers. 2021.
90 WGVRF 44.75 % 50.80 % 42.78 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
91 Semantical PVRCNN 44.75 % 49.40 % 41.94 % 0.07 s 1 core @ 2.5 Ghz (C/C++)
92 AFTD 44.74 % 53.94 % 42.36 % 1 s 1 core @ 2.5 Ghz (Python + C/C++)
93 U_RVRCNN_V2_1 44.73 % 51.76 % 42.62 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
94 Dune-DCF-e11 44.58 % 52.44 % 41.75 % 1 s 1 core @ 2.5 Ghz (C/C++)
95 CF-cd-io-tv 44.54 % 53.64 % 41.21 % 1 s 1 core @ 2.5 Ghz (C/C++)
96 Dune-DCF-e09 44.50 % 52.64 % 41.86 % 1 s 1 core @ 2.5 Ghz (C/C++)
97 SIF 44.28 % 52.05 % 42.03 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
P. An: SIF. Submitted to CVIU 2021.
98 LazyTorch-CP-Infer-O 44.27 % 51.92 % 41.99 % 1 s 1 core @ 2.5 Ghz (C/C++)
99 KeyPoint-IoUHead 44.27 % 53.12 % 41.83 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
100 LazyTorch-CP-Small-P 44.25 % 51.84 % 41.97 % 1 s 1 core @ 2.5 Ghz (C/C++)
101 IoU-2B 44.19 % 55.31 % 40.33 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
102 DVFENet 44.12 % 50.98 % 41.62 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
Y. He, G. Xia, Y. Luo, L. Su, Z. Zhang, W. Li and P. Wang: DVFENet: Dual-branch Voxel Feature Extraction Network for 3D Object Detection. Neurocomputing 2021.
103 CenterPoint (pcdet) 44.08 % 51.76 % 41.80 % 0.051 sec/frame 2 x V100
104 CrazyTensor-CP 44.06 % 51.25 % 41.50 % 1 s 1 core @ 2.5 Ghz (Python)
105 cff-tv-t 44.00 % 54.42 % 41.46 % 1 s 1 core @ 2.5 Ghz (C/C++)
106 CF-base-train 43.90 % 51.40 % 41.24 % 1 s 1 core @ 2.5 Ghz (C/C++)
107 VGA-RCNN 43.89 % 51.80 % 41.57 % 0.07 s 1 core @ 2.5 Ghz (Python)
108 GS-FPS 43.88 % 50.53 % 40.93 % TBD s 1 core @ 2.5 Ghz (C/C++)
109 IKT3D
This method makes use of Velodyne laser scans.
43.88 % 49.25 % 41.79 % 0.05 s 1 core @ 2.5 Ghz (Python)
110 City-CF-fixed 43.86 % 51.92 % 41.33 % 1 s 1 core @ 2.5 Ghz (C/C++)
111 Faraway-Frustum
This method makes use of Velodyne laser scans.
code 43.85 % 52.15 % 41.68 % 0.1 s GPU @ 2.5 Ghz (Python)
H. Zhang, D. Yang, E. Yurtsever, K. Redmill and U. Ozguner: Faraway-frustum: Dealing with lidar sparsity for 3D object detection using fusion. 2021 IEEE International Intelligent Transportation Systems Conference (ITSC) 2021.
112 PSA-SSD 43.77 % 50.26 % 41.75 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
113 BASA 43.67 % 50.82 % 40.91 % 1 s 1 core @ 2.5 Ghz (python)
114 Dune-DCF-e15 43.63 % 51.18 % 41.11 % 1 s 1 core @ 2.5 Ghz (C/C++)
115 AGS-SSD[la] 43.60 % 51.06 % 40.37 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
116 IPS 43.58 % 50.38 % 41.54 % TBD s 1 core @ 2.5 Ghz (C/C++)
117 S-AT GCN 43.43 % 50.63 % 41.58 % 0.02 s GPU @ 2.0 Ghz (Python)
L. Wang, C. Wang, X. Zhang, T. Lan and J. Li: S-AT GCN: Spatial-Attention Graph Convolution Network based Feature Enhancement for 3D Object Detection. CoRR 2021.
118 CF-ctdep-train 43.20 % 50.14 % 40.69 % 1 s 1 core @ 2.5 Ghz (C/C++)
119 GEO_LOC 43.10 % 49.74 % 41.02 % TBD s 1 core @ 2.5 Ghz (C/C++)
120 HPV-RCNN 42.99 % 50.53 % 39.54 % 0.15 s 1 core @ 2.5 Ghz (Python)
121 City-CF 42.95 % 49.91 % 40.61 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
122 GS-FPS-LT 42.89 % 49.33 % 40.63 % TBD s 1 core @ 2.5 Ghz (C/C++)
123 BirdNet+
This method makes use of Velodyne laser scans.
code 42.87 % 48.90 % 40.59 % 0.11 s Titan Xp (PyTorch)
A. Barrera, J. Beltrán, C. Guindel, J. Iglesias and F. García: BirdNet+: Two-Stage 3D Object Detection in LiDAR through a Sparsity-Invariant Bird’s Eye View. IEEE Access 2021.
124 CZY 42.80 % 49.42 % 40.83 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
125 TBD 42.76 % 50.17 % 39.75 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
126 IA-SSD (multi) code 42.61 % 51.76 % 40.51 % 0.014 s 1 core @ 2.5 Ghz (C/C++)
Y. Zhang, Q. Hu, G. Xu, Y. Ma, J. Wan and Y. Guo: Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds. CVPR 2022.
127 NV-RCNN 42.58 % 49.00 % 40.39 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
128 XView 42.42 % 47.24 % 39.96 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
L. Xie, G. Xu, D. Cai and X. He: X-view: Non-egocentric Multi-View 3D Object Detector. 2021.
129 PVTr 42.26 % 48.79 % 40.27 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
130 ATT_SSD 42.24 % 49.55 % 40.13 % 0.01 s 1 core @ 2.5 Ghz (Python)
131 T_PVRCNN_V2 42.21 % 50.58 % 39.81 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
132 T_PVRCNN 41.87 % 49.87 % 39.44 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
133 SWA code 41.57 % 48.98 % 39.32 % 0.18 s 1 core @ 2.5 Ghz (C/C++)
134 TTT_SSD 41.19 % 47.42 % 39.19 % TBD s 1 core @ 2.5 Ghz (C/C++)
135 SECOND_7862 40.96 % 47.55 % 38.85 % 1 s 1 core @ 2.5 Ghz (Python)
136 PFF3D
This method makes use of Velodyne laser scans.
code 40.94 % 48.74 % 38.54 % 0.05 s GPU @ 3.0 Ghz (Python + C/C++)
L. Wen and K. Jo: Fast and Accurate 3D Object Detection for Lidar-Camera-Based Autonomous Vehicles Using One Shared Voxel-Based Backbone. IEEE Access 2021.
137 CAD
This method uses stereo information.
This method makes use of Velodyne laser scans.
40.93 % 48.07 % 38.43 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
138 CrazyTensor-CF 40.78 % 48.79 % 38.16 % 1 s 1 core @ 2.5 Ghz (C/C++)
139 ZMMPP 39.11 % 46.50 % 37.04 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
140 DSGN++
This method uses stereo information.
code 38.92 % 50.26 % 35.12 % 0.2 s GeForce RTX 2080Ti
Y. Chen, S. Huang, S. Liu, B. Yu and J. Jia: DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors. arXiv preprint arXiv:2204.03039 2022.
141 AB3DMOT
This method makes use of Velodyne laser scans.
This is an online method (no batch processing).
code 38.79 % 47.51 % 35.85 % 0.0047 s 1 core @ 2.5 Ghz (python)
X. Weng and K. Kitani: A Baseline for 3D Multi-Object Tracking. arXiv:1907.03961 2019.
142 BirdNet+ (legacy)
This method makes use of Velodyne laser scans.
code 38.28 % 45.53 % 35.37 % 0.1 s Titan Xp (PyTorch)
A. Barrera, C. Guindel, J. Beltrán and F. García: BirdNet+: End-to-End 3D Object Detection in LiDAR Bird’s Eye View. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) 2020.
143 LightCPC code 38.17 % 44.37 % 36.04 % 0.02 s 1 core @ 2.5 Ghz (Python + C/C++)
144 CSW3D
This method makes use of Velodyne laser scans.
37.96 % 49.27 % 33.83 % 0.03 s 4 cores @ 2.5 Ghz (C/C++)
J. Hu, T. Wu, H. Fu, Z. Wang and K. Ding: Cascaded Sliding Window Based Real-Time 3D Region Proposal for Pedestrian Detection. ROBIO 2019.
145 KPP3D code 37.82 % 45.25 % 35.36 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
146 StereoDistill 37.75 % 50.79 % 34.28 % 0.4 s 1 core @ 2.5 Ghz (Python)
147 DMF
This method uses stereo information.
34.92 % 42.08 % 32.69 % 0.2 s 1 core @ 2.5 Ghz (Python + C/C++)
X. J. Chen and W. Xu: Disparity-Based Multiscale Fusion Network for Transportation Detection. IEEE Transactions on Intelligent Transportation Systems 2022.
148 SparsePool code 34.15 % 43.33 % 31.78 % 0.13 s 8 cores @ 2.5 Ghz (Python)
Z. Wang, W. Zhan and M. Tomizuka: Fusing bird view lidar point cloud and front view camera image for deep object detection. arXiv preprint arXiv:1711.06703 2017.
149 MMLAB LIGA-Stereo
This method uses stereo information.
code 34.13 % 44.71 % 30.42 % 0.4 s 1 core @ 2.5 Ghz (Python + C/C++)
X. Guo, S. Shi, X. Wang and H. Li: LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.
150 AVOD
This method makes use of Velodyne laser scans.
code 33.57 % 42.58 % 30.14 % 0.08 s Titan X (Pascal)
J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object Detection from View Aggregation. IROS 2018.
151 SparsePool code 33.22 % 41.55 % 29.66 % 0.13 s 8 cores @ 2.5 Ghz (Python)
Z. Wang, W. Zhan and M. Tomizuka: Fusing bird view lidar point cloud and front view camera image for deep object detection. arXiv preprint arXiv:1711.06703 2017.
152 Pseudo-Stereo++ 32.38 % 43.37 % 28.66 % 0.4 s 1 core @ 2.5 Ghz (Python + C/C++)
153 CZY 32.05 % 39.50 % 29.90 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
154 PS 31.13 % 41.55 % 27.50 % 0.4 s 1 core @ 2.5 Ghz (Python + C/C++)
155 CG-Stereo
This method uses stereo information.
29.56 % 39.24 % 25.87 % 0.57 s GeForce RTX 2080 Ti
C. Li, J. Ku and S. Waslander: Confidence Guided Stereo 3D Object Detection with Split Depth Estimation. IROS 2020.
156 PointRGBNet 29.32 % 38.07 % 26.94 % 0.08 s 4 cores @ 2.5 Ghz (Python + C/C++)
P. Xie Desheng: Real-time Detection of 3D Objects Based on Multi-Sensor Information Fusion. Automotive Engineering 2022.
157 Disp R-CNN
This method uses stereo information.
code 29.12 % 42.72 % 25.09 % 0.387 s GPU @ 2.5 Ghz (Python + C/C++)
J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao: Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation. CVPR 2020.
158 Disp R-CNN (velo)
This method uses stereo information.
code 28.34 % 40.21 % 24.46 % 0.387 s GPU @ 2.5 Ghz (Python + C/C++)
J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao: Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation. CVPR 2020.
159 BirdNet
This method makes use of Velodyne laser scans.
23.06 % 28.20 % 21.65 % 0.11 s Titan Xp (Caffe)
J. Beltrán, C. Guindel, F. Moreno, D. Cruzado, F. García and A. Escalera: BirdNet: A 3D Object Detection Framework from LiDAR Information. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
160 OC Stereo
This method uses stereo information.
code 20.80 % 29.79 % 18.62 % 0.35 s 1 core @ 2.5 Ghz (Python + C/C++)
A. Pon, J. Ku, C. Li and S. Waslander: Object-Centric Stereo Matching for 3D Object Detection. ICRA 2020.
161 YOLOStereo3D
This method uses stereo information.
code 20.76 % 31.01 % 18.41 % 0.1 s GPU 1080Ti
Y. Liu, L. Wang and M. Liu: YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection. 2021 International Conference on Robotics and Automation (ICRA) 2021.
162 DSGN
This method uses stereo information.
code 20.75 % 26.61 % 18.86 % 0.67 s NVIDIA Tesla V100
Y. Chen, S. Liu, X. Shen and J. Jia: DSGN: Deep Stereo Geometry Network for 3D Object Detection. CVPR 2020.
163 Complexer-YOLO
This method makes use of Velodyne laser scans.
18.26 % 21.42 % 17.06 % 0.06 s GPU @ 3.5 Ghz (C/C++)
M. Simon, K. Amende, A. Kraus, J. Honer, T. Samann, H. Kaulbersch, S. Milz and H. Michael Gross: Complexer-YOLO: Real-Time 3D Object Detection and Tracking on Semantic Point Clouds. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2019.
164 TopNet-Retina
This method makes use of Velodyne laser scans.
14.57 % 18.04 % 12.48 % 52ms GeForce 1080Ti (tensorflow-gpu, v1.12)
S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
165 RT3D-GMP
This method uses stereo information.
14.22 % 19.92 % 12.83 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
H. Königshof and C. Stiller: Learning-Based Shape Estimation with Grid Map Patches for Realtime 3D Object Detection for Automated Driving. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) 2020.
166 TopNet-HighRes
This method makes use of Velodyne laser scans.
13.50 % 19.43 % 11.93 % 101ms NVIDIA GeForce 1080 Ti (tensorflow-gpu)
S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
167 Anonymous 13.47 % 20.42 % 11.64 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
168 ESGN
This method uses stereo information.
13.03 % 17.94 % 11.54 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
A. Gao, Y. Pang, J. Nie, Z. Shao, J. Cao, Y. Guo and X. Li: ESGN: Efficient Stereo Geometry Network for Fast 3D Object Detection. IEEE Transactions on Circuits and Systems for Video Technology 2022.
169 DD3D code 12.51 % 18.58 % 10.65 % n/a s 1 core @ 2.5 Ghz (C/C++)
D. Park, R. Ambrus, V. Guizilini, J. Li and A. Gaidon: Is Pseudo-Lidar needed for Monocular 3D Object detection? IEEE/CVF International Conference on Computer Vision (ICCV).
170 DEPT 12.29 % 18.05 % 10.50 % 0.03 s 1 core @ 2.5 Ghz (Python)
171 PS-fld code 12.23 % 19.03 % 10.53 % 0.25 s 1 core @ 2.5 Ghz (C/C++)
Y. Chen, H. Dai and Y. Ding: Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.
172 DD3Dv2 code 12.16 % 17.74 % 10.49 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
173 CIE 11.94 % 17.90 % 10.34 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
174 MonoInsight 11.28 % 16.08 % 9.69 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
175 OPA-3D code 11.01 % 17.14 % 9.94 % 0.04 s 1 core @ 3.5 Ghz (Python)
176 GCDR 10.92 % 15.65 % 9.86 % 0.28 s 1 core @ 2.5 Ghz (Python)
177 LT-M3OD 10.89 % 16.63 % 9.20 % 0.03 s 1 core @ 2.5 Ghz (Python)
178 MonoASS 10.78 % 16.47 % 9.58 % 0.04 s 1 core @ 2.5 Ghz (Python)
179 MonoDTR 10.59 % 16.66 % 9.00 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
K. Huang, T. Wu, H. Su and W. Hsu: MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer. CVPR 2022.
180 BAIR 10.50 % 16.00 % 8.80 % 0.03 s 1 core @ 2.5 Ghz (Python)
181 BSM3D 10.41 % 15.30 % 8.89 % 0.03 s 1 core @ 2.5 Ghz (Python)
182 GUPNet code 10.37 % 15.62 % 8.79 % NA s 1 core @ 2.5 Ghz (Python + C/C++)
Y. Lu, X. Ma, L. Yang, T. Zhang, Y. Liu, Q. Chu, J. Yan and W. Ouyang: Geometry Uncertainty Projection Network for Monocular 3D Object Detection. arXiv preprint arXiv:2107.13774 2021.
183 CMKD code 10.28 % 16.03 % 8.85 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
Y. Hong, H. Dai and Y. Ding: Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection. ECCV 2022.
184 Lite-FPN-GUPNet 10.08 % 15.73 % 8.52 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
185 BCA 9.99 % 15.00 % 8.49 % 0.17 s GPU @ 2.5 Ghz (Python)
186 GPENet code 9.96 % 15.47 % 8.55 % 0.02 s GPU @ 2.5 Ghz (Python)
187 MM3D 9.90 % 15.37 % 8.23 % NA s 1 core @ 2.5 Ghz (C/C++)
188 DEVIANT code 9.77 % 14.49 % 8.28 % 0.04 s 1 GPU (Python)
A. Kumar, G. Brazil, E. Corona, A. Parchami and X. Liu: DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection. European Conference on Computer Vision (ECCV) 2022.
189 HBD 9.66 % 15.26 % 8.17 % 0.12 s 1 core @ 2.5 Ghz (C/C++)
190 MonoNeRD 9.66 % 15.27 % 8.28 % na s 1 core @ 2.5 Ghz (C/C++)
191 MonoPCNS 9.65 % 15.56 % 8.27 % 0.14 s GPU @ 2.5 Ghz (Python)
192 Mono3DMethod 9.53 % 14.55 % 8.07 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
193 MonoAD 9.44 % 14.65 % 8.60 % 0.03 s GPU @ 2.5 Ghz (Python)
194 MonoA^2 9.42 % 13.82 % 7.99 % na s 1 core @ 2.5 Ghz (C/C++)
195 SARM3D 9.42 % 14.32 % 8.15 % 0.03 s GPU @ 2.5 Ghz (Python)
196 CaDDN code 9.41 % 14.72 % 8.17 % 0.63 s GPU @ 2.5 Ghz (Python)
C. Reading, A. Harakeh, J. Chae and S. Waslander: Categorical Depth Distribution Network for Monocular 3D Object Detection. CVPR 2021.
197 SGM3D code 9.39 % 15.39 % 8.61 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
Z. Zhou, L. Du, X. Ye, Z. Zou, X. Tan, L. Zhang, X. Xue and J. Feng: SGM3D: Stereo Guided Monocular 3D Object Detection. RA-L 2022.
198 AMNet 9.30 % 14.10 % 8.02 % 0.03 s GPU @ 1.0 Ghz (Python)
199 Anonymous 9.08 % 13.35 % 7.63 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
200 MonoRCNN++ code 9.04 % 13.45 % 7.74 % 0.07 s GPU @ 2.5 Ghz (Python)
X. Shi, Z. Chen and T. Kim: Multivariate Probabilistic Monocular 3D Object Detection. WACV 2023.
201 M3DGAF 8.93 % 13.42 % 7.58 % 0.07 s 1 core @ 2.5 Ghz (Python)
202 MonoXiver 8.93 % 13.75 % 7.61 % 0.03s GPU @ 2.5 Ghz (Python)
203 HomoLoss(monoflex) code 8.81 % 13.26 % 7.41 % 0.04 s 1 core @ 2.5 Ghz (Python)
J. Gu, B. Wu, L. Fan, J. Huang, S. Cao, Z. Xiang and X. Hua: Homography Loss for Monocular 3D Object Detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.
204 MonoDDE 8.41 % 12.38 % 7.16 % 0.04 s 1 core @ 2.5 Ghz (Python)
Z. Li, Z. Qu, Y. Zhou, J. Liu, H. Wang and L. Jiang: Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection. CVPR 2022.
205 SparseLiDAR_fusion 8.23 % 12.59 % 6.82 % 0.08 s 1 core @ 2.5 Ghz (C/C++)
206 MDSNet 8.18 % 12.05 % 7.03 % 0.05 s 1 core @ 2.5 Ghz (Python)
Z. Xie, Y. Song, J. Wu, Z. Li, C. Song and Z. Xu: MDS-Net: Multi-Scale Depth Stratification 3D Object Detection from Monocular Images. Sensors 2022.
207 DCD code 8.08 % 11.76 % 6.61 % 1 s 1 core @ 2.5 Ghz (C/C++)
208 MonoAug 7.94 % 12.66 % 6.64 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
209 LPCG-Monoflex code 7.92 % 12.11 % 6.61 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
L. Peng, F. Liu, Z. Yu, S. Yan, D. Deng, Z. Yang, H. Liu and D. Cai: Lidar Point Cloud Guided Monocular 3D Object Detection. ECCV 2022.
210 RefinedMPL 7.92 % 13.09 % 7.25 % 0.15 s GPU @ 2.5 Ghz (Python + C/C++)
J. Vianney, S. Aich and B. Liu: RefinedMPL: Refined Monocular PseudoLiDAR for 3D Object Detection in Autonomous Driving. arXiv preprint arXiv:1911.09712 2019.
211 3DSeMoDLE code 7.71 % 11.86 % 6.43 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
212 Shape-Aware 7.65 % 11.69 % 6.35 % 0.05 s 1 core @ 2.5 Ghz (Python)
213 MonoRUn code 7.59 % 11.70 % 6.34 % 0.07 s GPU @ 2.5 Ghz (Python + C/C++)
H. Chen, Y. Huang, W. Tian, Z. Gao and L. Xiong: MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2021.
214 MonoFlex 7.36 % 10.36 % 6.29 % 0.03 s GPU @ 2.5 Ghz (Python)
Y. Zhang, J. Lu and J. Zhou: Objects are Different: Flexible Monocular 3D Object Detection. CVPR 2021.
215 MonoPair 7.04 % 10.99 % 6.29 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
Y. Chen, L. Tai, K. Sun and M. Li: MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020.
216 monodle code 6.96 % 10.73 % 6.20 % 0.04 s GPU @ 2.5 Ghz (Python)
X. Ma, Y. Zhang, D. Xu, D. Zhou, S. Yi, H. Li and W. Ouyang: Delving into Localization Errors for Monocular 3D Object Detection. CVPR 2021.
217 MonoAug 6.87 % 10.81 % 5.66 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
218 TopNet-DecayRate
This method makes use of Velodyne laser scans.
6.59 % 8.78 % 6.25 % 92 ms NVIDIA GeForce 1080 Ti (tensorflow-gpu)
S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
219 MDNet 6.18 % 9.48 % 5.63 % 0.2 s 1 core @ 2.5 Ghz (C/C++)
220 MK3D 6.15 % 8.76 % 5.14 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
221 Shift R-CNN (mono) code 5.66 % 8.58 % 4.49 % 0.25 s GPU @ 1.5 Ghz (Python)
A. Naiden, V. Paunescu, G. Kim, B. Jeon and M. Leordeanu: Shift R-CNN: Deep Monocular 3D Object Detection With Closed-form Geometric Constraints. ICIP 2019.
222 FMF-occlusion-net 5.62 % 8.69 % 5.25 % 0.16 s 1 core @ 2.5 Ghz (Python + C/C++)
H. Liu, H. Liu, Y. Wang, F. Sun and W. Huang: Fine-grained Multi-level Fusion for Anti-occlusion Monocular 3D Object Detection. IEEE Transactions on Image Processing 2022.
223 Aug3D-RPN 5.22 % 7.14 % 4.21 % 0.08 s 1 core @ 2.5 Ghz (C/C++)
C. He, J. Huang, X. Hua and L. Zhang: Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth. 2021.
224 TopNet-UncEst
This method makes use of Velodyne laser scans.
4.60 % 6.88 % 3.79 % 0.09 s NVIDIA GeForce 1080 Ti (tensorflow-gpu)
S. Wirges, M. Braun, M. Lauer and C. Stiller: Capturing Object Detection Uncertainty in Multi-Layer Grid Maps. 2019.
225 MonoPSR code 4.56 % 7.24 % 4.11 % 0.2 s GPU @ 3.5 Ghz (Python)
J. Ku*, A. Pon* and S. Waslander: Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction. CVPR 2019.
226 DFR-Net 4.52 % 6.66 % 3.71 % 0.18 s 1080 Ti (Pytorch)
Z. Zou, X. Ye, L. Du, X. Cheng, X. Tan, L. Zhang, J. Feng, X. Xue and E. Ding: The devil is in the task: Exploiting reciprocal appearance-localization features for monocular 3d object detection. ICCV 2021.
227 MoGDE 4.51 % 7.22 % 3.83 % 0.03 s GPU @ 2.5 Ghz (Python)
228 QD-3DT
This is an online method (no batch processing).
code 4.23 % 6.62 % 3.39 % 0.03 s GPU @ 2.5 Ghz (Python)
H. Hu, Y. Yang, T. Fischer, F. Yu, T. Darrell and M. Sun: Monocular Quasi-Dense 3D Object Tracking. arXiv:2103.07351 2021.
229 M3D-RPN code 4.05 % 5.65 % 3.29 % 0.16 s GPU @ 1.5 Ghz (Python)
G. Brazil and X. Liu: M3D-RPN: Monocular 3D Region Proposal Network for Object Detection. ICCV 2019.
230 DDMP-3D 4.02 % 5.53 % 3.36 % 0.18 s 1 core @ 2.5 Ghz (Python)
L. Wang, L. Du, X. Ye, Y. Fu, G. Guo, X. Xue, J. Feng and L. Zhang: Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection. CVPR 2020.
231 CMAN 3.96 % 5.24 % 3.18 % 0.15 s 1 core @ 2.5 Ghz (Python)
C. Yuanzhouhan Cao: CMAN: Leaning Global Structure Correlation for Monocular 3D Object Detection. IEEE Trans. Intell. Transport. Syst. 2022.
232 D4LCN code 3.86 % 5.06 % 3.59 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
M. Ding, Y. Huo, H. Yi, Z. Wang, J. Shi, Z. Lu and P. Luo: Learning Depth-Guided Convolutions for Monocular 3D Object Detection. CVPR 2020.
233 RT3DStereo
This method uses stereo information.
3.65 % 4.72 % 3.00 % 0.08 s GPU @ 2.5 Ghz (C/C++)
H. Königshof, N. Salscheider and C. Stiller: Realtime 3D Object Detection for Automated Driving Using Stereo Vision and Semantic Information. Proc. IEEE Intl. Conf. Intelligent Transportation Systems 2019.
234 MonoEF 3.05 % 4.61 % 2.85 % 0.03 s 1 core @ 2.5 Ghz (Python)
Y. Zhou, Y. He, H. Zhu, C. Wang, H. Li and Q. Jiang: Monocular 3D Object Detection: An Extrinsic Parameter Free Approach. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021.
235 SS3D 2.09 % 2.48 % 1.61 % 48 ms Tesla V100 (Python)
E. Jörgensen, C. Zach and F. Kahl: Monocular 3D Object Detection and Box Fitting Trained End-to-End Using Intersection-over-Union Loss. CoRR 2019.
236 SparVox3D 2.05 % 2.90 % 1.69 % 0.05 s GPU @ 2.0 Ghz (Python)
E. Balatkan and F. Kıraç: Improving Regression Performance on Monocular 3D Object Detection Using Bin-Mixing and Sparse Voxel Data. 2021 6th International Conference on Computer Science and Engineering (UBMK) 2021.
237 CDTrack3D code 1.91 % 2.56 % 1.49 % 0.0106 s NVIDIA RTX 3090 GPU, i9 10850k CPU
238 PGD-FCOS3D code 1.88 % 2.82 % 1.54 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
T. Wang, X. Zhu, J. Pang and D. Lin: Probabilistic and Geometric Depth: Detecting Objects in Perspective. Conference on Robot Learning (CoRL) 2021.
239 SSAL-Mono 1.53 % 2.18 % 1.54 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
240 mBoW
This method makes use of Velodyne laser scans.
0.00 % 0.00 % 0.00 % 10 s 1 core @ 2.5 Ghz (C/C++)
J. Behley, V. Steinhage and A. Cremers: Laser-based Segment Classification Using a Mixture of Bag-of-Words. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2013.

Cyclist


Method Setting Code Moderate Easy Hard Runtime Environment
1 BiProDet 78.19 % 89.65 % 71.13 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
2 CasA++ code 76.99 % 88.93 % 70.10 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
H. Wu, J. Deng, C. Wen, X. Li and C. Wang: CasA: A Cascade Attention Network for 3D Object Detection from LiDAR point clouds. IEEE Transactions on Geoscience and Remote Sensing 2022.
3 TED 76.95 % 89.54 % 70.31 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
H. Wu, C. Wen, W. Li, R. Yang and C. Wang: Transformation-Equivariant 3D Object Detection for Autonomous Driving. AAAI 2023.
4 CasA code 75.74 % 88.99 % 68.47 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
H. Wu, J. Deng, C. Wen, X. Li and C. Wang: CasA: A Cascade Attention Network for 3D Object Detection from LiDAR point clouds. IEEE Transactions on Geoscience and Remote Sensing 2022.
5 LoGoNet 74.92 % 85.85 % 67.62 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
6 HMFI code 74.06 % 85.69 % 67.11 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
X. Li, B. Shi, Y. Hou, X. Wu, T. Ma, Y. Li and L. He: Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection. ECCV 2022.
7 CZY_PPF_Net2 73.64 % 85.39 % 66.01 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
8 EQ-PVRCNN code 73.30 % 86.25 % 65.49 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Yang, L. Jiang, Y. Sun, B. Schiele and J. Jia: A Unified Query-based Paradigm for Point Cloud Understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2022.
9 Semantical PVRCNN 73.14 % 86.75 % 64.87 % 0.07 s 1 core @ 2.5 Ghz (C/C++)
10 VoCo 73.08 % 85.29 % 66.46 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
11 SPT 72.90 % 86.10 % 65.13 % 0.1 s GPU @ 2.5 Ghz (Python)
12 CAD 72.87 % 87.09 % 65.78 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
13 DCAN-Second code 72.74 % 88.62 % 65.89 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
14 CZY_PPF_Net 72.73 % 86.92 % 65.30 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
15 DSA-PV-RCNN
This method makes use of Velodyne laser scans.
code 72.61 % 83.93 % 65.82 % 0.08 s 1 core @ 2.5 Ghz (Python + C/C++)
P. Bhattacharyya, C. Huang and K. Czarnecki: SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection. 2021.
16 CAT-Det 72.51 % 85.35 % 65.55 % 0.3 s GPU @ 2.5 Ghz (Python + C/C++)
Y. Zhang, J. Chen and D. Huang: CAT-Det: Contrastively Augmented Transformer for Multi-modal 3D Object Detection. CVPR 2022.
17 Reprod-Two-Branch 72.16 % 87.50 % 64.41 % 0.05 s 1 core @ 2.5 Ghz (C/C++)
18 CFF-tv 72.02 % 86.54 % 64.25 % 1 s 1 core @ 2.5 Ghz (C/C++)
19 CFF-ep25 71.99 % 86.78 % 64.18 % 1 s 1 core @ 2.5 Ghz (C/C++)
20 PA-RCNN code 71.98 % 86.09 % 64.02 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
21 SGDA3D 71.90 % 84.81 % 64.88 % 0.07 s 1 core @ 2.5 Ghz (Python)
22 Under Blind Review#2 71.89 % 84.41 % 65.15 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
23 BtcDet
This method makes use of Velodyne laser scans.
code 71.76 % 84.48 % 64.70 % 0.09 s GPU @ 2.5 Ghz (Python + C/C++)
Q. Xu, Y. Zhong and U. Neumann: Behind the Curtain: Learning Occluded Shapes for 3D Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence 2022.
24 CF-ctdep-tv-ta 71.74 % 87.38 % 64.30 % 1 s 1 core @ 2.5 Ghz (C/C++)
25 cff-tv-v2-ep25 71.70 % 85.61 % 64.15 % 1 s 1 core @ 2.5 Ghz (C/C++)
26 PointPainting
This method makes use of Velodyne laser scans.
71.54 % 83.91 % 62.97 % 0.4 s GPU @ 2.5 Ghz (Python + C/C++)
S. Vora, A. Lang, B. Helou and O. Beijbom: PointPainting: Sequential Fusion for 3D Object Detection. CVPR 2020.
27 CFF-tv-v2 71.53 % 85.70 % 63.77 % 1 s 1 core @ 2.5 Ghz (C/C++)
28 RangeIoUDet
This method makes use of Velodyne laser scans.
71.49 % 85.99 % 63.62 % 0.02 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Liang, Z. Zhang, M. Zhang, X. Zhao and S. Pu: RangeIoUDet: Range Image Based Real-Time 3D Object Detector Optimized by Intersection Over Union. CVPR 2021.
29 ACDet code 71.48 % 87.76 % 64.69 % 0.05 s 1 core @ 2.5 Ghz (C/C++)
J. Xu, G. Wang, X. Zhang and G. Wan: ACDet: Attentive Cross-view Fusion for LiDAR-based 3D Object Detection. 3DV 2022.
30 IA-SSD (single) code 71.44 % 85.91 % 63.41 % 0.013 s 1 core @ 2.5 Ghz (C/C++)
Y. Zhang, Q. Hu, G. Xu, Y. Ma, J. Wan and Y. Guo: Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds. CVPR 2022.
31 3SNet 71.44 % 84.55 % 64.79 % 0.07 s GPU @ 2.5 Ghz (Python)
32 Anonymous
This method makes use of Velodyne laser scans.
71.43 % 84.75 % 64.89 % 0.05 s GPU @ 3.0 Ghz (Python + C/C++)
33 PDV code 71.31 % 85.54 % 64.40 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
J. Hu, T. Kuai and S. Waslander: Point Density-Aware Voxels for LiDAR 3D Object Detection. CVPR 2022.
34 HVNet 71.17 % 83.97 % 63.65 % 0.03 s GPU @ 2.0 Ghz (Python)
M. Ye, S. Xu and T. Cao: HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection. CVPR 2020.
35 M3DeTR code 70.89 % 85.03 % 63.14 % n/a s GPU @ 1.0 Ghz (Python)
T. Guan, J. Wang, S. Lan, R. Chandra, Z. Wu, L. Davis and D. Manocha: M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers. 2021.
36 USVLab BSAODet 70.85 % 85.28 % 64.09 % 0.04 s GPU @ 2.5 Ghz (Python + C/C++)
37 CZY_3917 70.73 % 83.46 % 63.16 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
38 CenterFuse 70.59 % 86.53 % 62.18 % 0.059 sec/frame 2 x V100
39 KPSCC code 70.59 % 83.06 % 63.07 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
40 CZY 70.32 % 86.42 % 63.32 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
41 TCDVF 70.28 % 82.85 % 63.54 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
42 USVLab BSAODet (S) 70.24 % 84.38 % 63.24 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
43 MVMM code 70.17 % 81.84 % 63.84 % 0.04 s GPU @ 2.5 Ghz (Python + C/C++)
44 CF-ctdep-tv 70.16 % 86.31 % 62.63 % 1 s 1 core @ 2.5 Ghz (C/C++)
45 SPG_mini
This method makes use of Velodyne laser scans.
code 70.09 % 82.66 % 63.61 % 0.09 s GPU @ 2.5 Ghz (Python)
Q. Xu, Y. Zhou, W. Wang, C. Qi and D. Anguelov: SPG: Unsupervised Domain Adaptation for 3D Object Detection via Semantic Point Generation. Proceedings of the IEEE conference on computer vision and pattern recognition (ICCV) 2021.
46 TBD 70.09 % 82.60 % 63.39 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
47 VGA-RCNN 69.86 % 80.95 % 62.16 % 0.07 s 1 core @ 2.5 Ghz (Python)
48 FV2P v2 69.82 % 86.88 % 63.09 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
49 KeyFuse2B 69.76 % 84.95 % 62.16 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
50 IKT3D
This method makes use of Velodyne laser scans.
69.74 % 81.92 % 62.59 % 0.05 s 1 core @ 2.5 Ghz (Python)
51 CF-base-tv 69.49 % 84.12 % 61.85 % 1 s 1 core @ 2.5 Ghz (C/C++)
52 DGT-Det3D code 69.47 % 81.26 % 61.88 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
53 PVTr 69.46 % 84.62 % 62.33 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
54 HPV-RCNN 69.43 % 82.51 % 61.87 % 0.15 s 1 core @ 2.5 Ghz (Python)
55 IoU-2B 69.24 % 86.64 % 60.57 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
56 TBD 69.09 % 82.53 % 62.57 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
57 IPS 68.94 % 83.95 % 61.33 % TBD s 1 core @ 2.5 Ghz (C/C++)
58 MMLab PV-RCNN
This method makes use of Velodyne laser scans.
code 68.89 % 82.49 % 62.41 % 0.08 s 1 core @ 2.5 Ghz (Python + C/C++)
S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang and H. Li: PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection. CVPR 2020.
59 F-ConvNet
This method makes use of Velodyne laser scans.
code 68.88 % 84.16 % 60.05 % 0.47 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Wang and K. Jia: Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection. IROS 2019.
60 MMLab-PartA^2
This method makes use of Velodyne laser scans.
code 68.73 % 83.43 % 61.85 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
S. Shi, Z. Wang, J. Shi, X. Wang and H. Li: From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020.
61 WGVRF 68.71 % 82.04 % 62.04 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
62 U_RVRCNN_V2_1 68.65 % 80.63 % 61.90 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
63 CF-cd-io-tv 68.52 % 83.71 % 60.12 % 1 s 1 core @ 2.5 Ghz (C/C++)
64 HotSpotNet 68.51 % 83.29 % 61.84 % 0.04 s 1 core @ 2.5 Ghz (Python + C/C++)
Q. Chen, L. Sun, Z. Wang, K. Jia and A. Yuille: Object as Hotspots. Proceedings of the European Conference on Computer Vision (ECCV) 2020.
65 TBD 68.33 % 85.17 % 61.38 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
66 LightCPC code 68.24 % 84.48 % 61.82 % 0.02 s 1 core @ 2.5 Ghz (Python + C/C++)
67 BASA 68.22 % 81.97 % 61.48 % 1s 1 core @ 2.5 Ghz (python)
68 P2V-RCNN 68.06 % 81.09 % 60.73 % 0.1 s 2 cores @ 2.5 Ghz (Python)
J. Li, S. Luo, Z. Zhu, H. Dai, A. Krylov, Y. Ding and L. Shao: P2V-RCNN: Point to Voxel Feature Learning for 3D Object Detection from Point Clouds. IEEE Access 2021.
69 KPP3D code 67.97 % 81.23 % 60.72 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
70 H^23D R-CNN code 67.90 % 82.76 % 60.49 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
J. Deng, W. Zhou, Y. Zhang and H. Li: From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection. IEEE Transactions on Circuits and Systems for Video Technology 2021.
71 Anonymous 67.83 % 81.75 % 60.92 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
72 Self-Calib Conv 67.73 % 82.11 % 60.57 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
73 Anonymous 67.73 % 81.50 % 60.87 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
74 VPFNet code 67.66 % 80.83 % 61.36 % 0.2 s 1 core @ 2.5 Ghz (C/C++)
C. Wang, H. Chen and L. Fu: VPFNet: Voxel-Pixel Fusion Network for Multi-class 3D Object Detection. 2021.
75 3DSSD code 67.62 % 85.04 % 61.14 % 0.04 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Yang, Y. Sun, S. Liu and J. Jia: 3DSSD: Point-based 3D Single Stage Object Detector. CVPR 2020.
76 GS-FPS 67.57 % 80.63 % 61.45 % TBD s 1 core @ 2.5 Ghz (C/C++)
77 Fast-CLOCs 67.55 % 83.34 % 59.61 % 0.1 s GPU @ 2.5 Ghz (Python)
S. Pang, D. Morris and H. Radha: Fast-CLOCs: Fast Camera-LiDAR Object Candidates Fusion for 3D Object Detection. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022.
78 NV-RCNN 67.54 % 82.53 % 60.97 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
79 U_PVRCNN_V2 67.51 % 79.04 % 59.98 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
80 DGT-Det3D 67.44 % 80.73 % 60.71 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
81 cff-tv-t 67.41 % 85.91 % 60.15 % 1 s 1 core @ 2.5 Ghz (C/C++)
82 DVFENet 67.40 % 82.29 % 60.71 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
Y. He, G. Xia, Y. Luo, L. Su, Z. Zhang, W. Li and P. Wang: DVFENet: Dual-branch Voxel Feature Extraction Network for 3D Object Detection. Neurocomputing 2021.
83 FromVoxelToPoint code 67.36 % 82.68 % 59.15 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
J. Li, H. Dai, L. Shao and Y. Ding: From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder. MM '21: The 29th ACM International Conference on Multimedia (ACM MM) 2021.
84 AGS-SSD[la] 67.35 % 81.70 % 60.41 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
85 Point-GNN
This method makes use of Velodyne laser scans.
code 67.28 % 81.17 % 59.67 % 0.6 s GPU @ 2.5 Ghz (Python)
W. Shi and R. Rajkumar: Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud. CVPR 2020.
86 MMLab-PointRCNN
This method makes use of Velodyne laser scans.
code 67.24 % 82.56 % 60.28 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
S. Shi, X. Wang and H. Li: PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2019.
87 STD code 67.23 % 81.36 % 59.35 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
Z. Yang, Y. Sun, S. Liu, X. Shen and J. Jia: STD: Sparse-to-Dense 3D Object Detector for Point Cloud. ICCV 2019.
88 Anonymous 67.06 % 80.51 % 60.13 % 0.02 s 1 core @ 2.5 Ghz (C/C++)
89 SVGA-Net 66.82 % 81.25 % 59.37 % 0.03s 1 core @ 2.5 Ghz (Python + C/C++)
Q. He, Z. Wang, H. Zeng, Y. Zeng and Y. Liu: SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds. AAAI 2022.
90 PSA-SSD 66.79 % 79.56 % 59.94 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
91 KeyPoint-IoUHead 66.72 % 83.32 % 59.93 % 0.01 s 1 core @ 2.5 Ghz (C/C++)
92 S-AT GCN 66.71 % 78.53 % 60.19 % 0.02 s GPU @ 2.0 Ghz (Python)
L. Wang, C. Wang, X. Zhang, T. Lan and J. Li: S-AT GCN: Spatial-Attention Graph Convolution Network based Feature Enhancement for 3D Object Detection. CoRR 2021.
93 T_PVRCNN_V2 66.49 % 80.88 % 58.51 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
94 ATT_SSD 66.41 % 80.73 % 60.16 % 0.01 s 1 core @ 2.5 Ghz (Python)
95 cp-tv-kp-io-sc 66.40 % 82.88 % 58.53 % 1 s 1 core @ 2.5 Ghz (C/C++)
96 ARPNET 66.39 % 82.32 % 58.80 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
Y. Ye, C. Zhang and X. Hao: ARPNET: attention region proposal network for 3D object detection. Science China Information Sciences 2019.
97 IA-SSD (multi) code 66.29 % 81.30 % 59.58 % 0.014 s 1 core @ 2.5 Ghz (C/C++)
Y. Zhang, Q. Hu, G. Xu, Y. Ma, J. Wan and Y. Guo: Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds. CVPR 2022.
98 T_PVRCNN 66.17 % 79.84 % 59.04 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
99 cp-tv 66.08 % 80.65 % 58.98 % 1 s 1 core @ 2.5 Ghz (C/C++)
100 SWA code 66.08 % 78.96 % 60.18 % 0.18 s 1 core @ 2.5 Ghz (C/C++)
101 GEO_LOC 66.08 % 79.15 % 58.56 % TBD s 1 core @ 2.5 Ghz (C/C++)
102 MGAF-3DSSD code 66.00 % 83.03 % 57.57 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
J. Li, H. Dai, L. Shao and Y. Ding: Anchor-free 3D Single Stage Detector with Mask-Guided Attention for Point Cloud. MM '21: The 29th ACM International Conference on Multimedia (ACM MM) 2021.
103 AB3DMOT
This method makes use of Velodyne laser scans.
This is an online method (no batch processing).
code 65.85 % 80.00 % 58.69 % 0.0047s 1 core @ 2.5 Ghz (Python)
X. Weng and K. Kitani: A Baseline for 3D Multi-Object Tracking. arXiv:1907.03961 2019.
104 U_SECOND_V4 65.84 % 80.94 % 58.31 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
105 PSA-Det3D 65.51 % 79.21 % 59.06 % 0.1 s GPU @ 2.5 Ghz (Python)
106 TTT_SSD 65.31 % 78.56 % 59.27 % TBD s 1 core @ 2.5 Ghz (C/C++)
107 ZMMPP 65.23 % 77.62 % 58.08 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
108 VPNet 64.95 % 79.83 % 58.33 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
109 cp-tv-kp 64.87 % 79.91 % 58.22 % 1 s 1 core @ 2.5 Ghz (C/C++)
110 GS-FPS-LT 64.86 % 79.49 % 58.93 % TBD s 1 core @ 2.5 Ghz (C/C++)
111 PVRCNN_8369 64.56 % 79.60 % 57.94 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
112 Faraway-Frustum
This method makes use of Velodyne laser scans.
code 64.54 % 79.65 % 57.84 % 0.1 s GPU @ 2.5 Ghz (Python)
H. Zhang, D. Yang, E. Yurtsever, K. Redmill and U. Ozguner: Faraway-frustum: Dealing with lidar sparsity for 3D object detection using fusion. 2021 IEEE International Intelligent Transportation Systems Conference (ITSC) 2021.
113 Dune-DCF-e11 64.52 % 82.14 % 57.40 % 1 s 1 core @ 2.5 Ghz (C/C++)
114 SRDL 64.52 % 79.64 % 57.90 % 0.05 s 1 core @ 2.5 Ghz (Python + C/C++)
115 Dune-DCF-e15 64.42 % 81.10 % 57.35 % 1 s 1 core @ 2.5 Ghz (C/C++)
116 City-CF-fixed 64.39 % 81.11 % 57.74 % 1 s 1 core @ 2.5 Ghz (C/C++)
117 CF-ctdep-train 64.33 % 81.02 % 56.17 % 1 s 1 core @ 2.5 Ghz (C/C++)
118 City-CF 64.25 % 81.33 % 57.01 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
119 SIF 64.13 % 79.32 % 57.38 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
P. An: SIF. Submitted to CVIU 2021.
120 AFTD 64.03 % 82.99 % 55.93 % 1 s 1 core @ 2.5 Ghz (Python + C/C++)
121 SECOND_7862 63.95 % 78.30 % 57.28 % 1 s 1 core @ 2.5 Ghz (Python)
122 variance_point 63.90 % 78.49 % 56.51 % 0.05 s 1 core @ 2.5 Ghz (Python)
123 TANet code 63.77 % 79.16 % 56.21 % 0.035s GPU @ 2.5 Ghz (Python + C/C++)
Z. Liu, X. Zhao, T. Huang, R. Hu, Y. Zhou and X. Bai: TANet: Robust 3D Object Detection from Point Clouds with Triple Attention. AAAI 2020.
124 CF-base-train 63.63 % 80.31 % 55.56 % 1 s 1 core @ 2.5 Ghz (C/C++)
125 DTE3D 63.10 % 79.79 % 56.94 % 0.19 s 1 core @ 2.5 Ghz (C/C++)
126 XView 63.06 % 81.32 % 56.65 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
L. Xie, G. Xu, D. Cai and X. He: X-view: Non-egocentric Multi-View 3D Object Detector. 2021.
127 EPNet++ 62.94 % 78.57 % 56.62 % 0.1 s GPU @ 2.5 Ghz (Python)
Z. Liu, H. Tengteng, B. Li, X. Chen, X. Wang and X. Bai: EPNet++: Cascade Bi-directional Fusion for Multi-Modal 3D Object Detection. arXiv preprint arXiv:2112.11088 2021.
128 PointPillars
This method makes use of Velodyne laser scans.
code 62.73 % 79.90 % 55.58 % 16 ms 1080ti GPU and Intel i7 CPU
A. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang and O. Beijbom: PointPillars: Fast Encoders for Object Detection from Point Clouds. CVPR 2019.
129 Dune-DCF-e09 62.23 % 77.53 % 55.46 % 1 s 1 core @ 2.5 Ghz (C/C++)
130 CAD
This method uses stereo information.
This method makes use of Velodyne laser scans.
62.05 % 77.71 % 54.56 % 0.1 s 1 core @ 2.5 Ghz (Python + C/C++)
131 CrazyTensor-CF 61.95 % 80.59 % 55.16 % 1 s 1 core @ 2.5 Ghz (C/C++)
132 LazyTorch-CP-Infer-O 61.40 % 76.40 % 54.73 % 1 s 1 core @ 2.5 Ghz (C/C++)
133 F-PointNet
This method makes use of Velodyne laser scans.
code 61.37 % 77.26 % 53.78 % 0.17 s GPU @ 3.0 Ghz (Python)
C. Qi, W. Liu, C. Wu, H. Su and L. Guibas: Frustum PointNets for 3D Object Detection from RGB-D Data. arXiv preprint arXiv:1711.08488 2017.
134 CenterPoint (pcdet) 61.25 % 76.38 % 54.68 % 0.051 sec/frame 2 x V100
135 LazyTorch-CP-Small-P 61.07 % 76.37 % 54.73 % 1 s 1 core @ 2.5 Ghz (C/C++)
136 epBRM
This method makes use of Velodyne laser scans.
code 59.79 % 75.13 % 53.36 % 0.10 s 1 core @ 2.5 Ghz (C/C++)
K. Shin: Improving a Quality of 3D Object Detection by Spatial Transformation Mechanism. arXiv preprint arXiv:1910.04853 2019.
137 BirdNet+
This method makes use of Velodyne laser scans.
code 59.58 % 70.84 % 54.20 % 0.11 s Titan Xp (PyTorch)
A. Barrera, J. Beltrán, C. Guindel, J. Iglesias and F. García: BirdNet+: Two-Stage 3D Object Detection in LiDAR through a Sparsity-Invariant Bird’s Eye View. IEEE Access 2021.
138 CrazyTensor-CP 59.54 % 75.40 % 53.21 % 1 s 1 core @ 2.5 Ghz (Python)
139 DMF
This method uses stereo information.
57.99 % 71.92 % 51.55 % 0.2 s 1 core @ 2.5 Ghz (Python + C/C++)
X. J. Chen and W. Xu: Disparity-Based Multiscale Fusion Network for Transportation Detection. IEEE Transactions on Intelligent Transportation Systems 2022.
140 PointRGBNet 57.59 % 73.09 % 51.78 % 0.08 s 4 cores @ 2.5 Ghz (Python + C/C++)
P. Xie Desheng: Real-time Detection of 3D Objects Based on Multi-Sensor Information Fusion. Automotive Engineering 2022.
141 AVOD-FPN
This method makes use of Velodyne laser scans.
code 57.12 % 69.39 % 51.09 % 0.1 s Titan X (Pascal)
J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object Detection from View Aggregation. IROS 2018.
142 PiFeNet code 56.94 % 72.80 % 50.04 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
D. Le, H. Shi, H. Rezatofighi and J. Cai: Accurate and Real-time 3D Pedestrian Detection Using an Efficient Attentive Pillar Network. arXiv preprint arXiv:2112.15458 2022.
143 CZY 56.71 % 70.64 % 50.35 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
144 SCNet
This method makes use of Velodyne laser scans.
56.39 % 73.73 % 49.99 % 0.04 s GPU @ 3.0 Ghz (Python)
Z. Wang, H. Fu, L. Wang, L. Xiao and B. Dai: SCNet: Subdivision Coding Network for Object Detection Based on 3D Point Cloud. IEEE Access 2019.
145 PFF3D
This method makes use of Velodyne laser scans.
code 55.71 % 72.67 % 49.58 % 0.05 s GPU @ 3.0 Ghz (Python + C/C++)
L. Wen and K. Jo: Fast and Accurate 3D Object Detection for Lidar-Camera-Based Autonomous Vehicles Using One Shared Voxel-Based Backbone. IEEE Access 2021.
146 MLOD
This method makes use of Velodyne laser scans.
code 55.06 % 73.03 % 48.21 % 0.12 s GPU @ 1.5 Ghz (Python)
J. Deng and K. Czarnecki: MLOD: A multi-view 3D object detection based on robust feature fusion method. arXiv preprint arXiv:1909.04163 2019.
147 BirdNet+ (legacy)
This method makes use of Velodyne laser scans.
code 52.15 % 72.45 % 46.57 % 0.1 s Titan Xp (PyTorch)
A. Barrera, C. Guindel, J. Beltrán and F. García: BirdNet+: End-to-End 3D Object Detection in LiDAR Bird’s Eye View. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) 2020.
148 DSGN++
This method uses stereo information.
code 49.37 % 68.29 % 43.79 % 0.2 s GeForce RTX 2080Ti
Y. Chen, S. Huang, S. Liu, B. Yu and J. Jia: DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors. arXiv preprint arXiv:2204.03039 2022.
149 StereoDistill 48.37 % 69.46 % 42.69 % 0.4 s 1 core @ 2.5 Ghz (Python)
150 AVOD
This method makes use of Velodyne laser scans.
code 48.15 % 64.11 % 42.37 % 0.08 s Titan X (Pascal)
J. Ku, M. Mozifian, J. Lee, A. Harakeh and S. Waslander: Joint 3D Proposal Generation and Object Detection from View Aggregation. IROS 2018.
151 BirdNet
This method makes use of Velodyne laser scans.
41.56 % 58.64 % 36.94 % 0.11 s Titan Xp (Caffe)
J. Beltrán, C. Guindel, F. Moreno, D. Cruzado, F. García and A. Escalera: BirdNet: A 3D Object Detection Framework from LiDAR Information. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
152 SparsePool code 40.74 % 56.52 % 36.68 % 0.13 s 8 cores @ 2.5 Ghz (Python)
Z. Wang, W. Zhan and M. Tomizuka: Fusing bird view lidar point cloud and front view camera image for deep object detection. arXiv preprint arXiv:1711.06703 2017.
153 MMLAB LIGA-Stereo
This method uses stereo information.
code 40.60 % 58.95 % 35.27 % 0.4 s 1 core @ 2.5 Ghz (Python + C/C++)
X. Guo, S. Shi, X. Wang and H. Li: LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.
154 TopNet-Retina
This method makes use of Velodyne laser scans.
36.83 % 47.48 % 33.58 % 52 ms GeForce 1080Ti (tensorflow-gpu, v1.12)
S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
155 CG-Stereo
This method uses stereo information.
36.25 % 55.33 % 32.17 % 0.57 s GeForce RTX 2080 Ti
C. Li, J. Ku and S. Waslander: Confidence Guided Stereo 3D Object Detection with Split Depth Estimation. IROS 2020.
156 Pseudo-Stereo++ 35.75 % 54.06 % 31.17 % 0.4 s 1 core @ 2.5 Ghz (Python + C/C++)
157 SparsePool code 35.24 % 43.55 % 30.15 % 0.13 s 8 cores @ 2.5 Ghz (Python)
Z. Wang, W. Zhan and M. Tomizuka: Fusing bird view lidar point cloud and front view camera image for deep object detection. arXiv preprint arXiv:1711.06703 2017.
158 PS 32.16 % 49.23 % 27.73 % 0.4 s 1 core @ 2.5 Ghz (Python + C/C++)
159 Disp R-CNN (velo)
This method uses stereo information.
code 27.04 % 44.19 % 23.58 % 0.387 s GPU @ 2.5 Ghz (Python + C/C++)
J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao: Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation. CVPR 2020.
160 Disp R-CNN
This method uses stereo information.
code 27.04 % 44.19 % 23.58 % 0.387 s GPU @ 2.5 Ghz (Python + C/C++)
J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou and H. Bao: Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation. CVPR 2020.
161 Complexer-YOLO
This method makes use of Velodyne laser scans.
25.43 % 32.00 % 22.88 % 0.06 s GPU @ 3.5 Ghz (C/C++)
M. Simon, K. Amende, A. Kraus, J. Honer, T. Samann, H. Kaulbersch, S. Milz and H. Michael Gross: Complexer-YOLO: Real-Time 3D Object Detection and Tracking on Semantic Point Clouds. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2019.
162 DSGN
This method uses stereo information.
code 21.04 % 31.23 % 18.93 % 0.67 s NVIDIA Tesla V100
Y. Chen, S. Liu, X. Shen and J. Jia: DSGN: Deep Stereo Geometry Network for 3D Object Detection. CVPR 2020.
163 OC Stereo
This method uses stereo information.
code 19.23 % 32.47 % 17.11 % 0.35 s 1 core @ 2.5 Ghz (Python + C/C++)
A. Pon, J. Ku, C. Li and S. Waslander: Object-Centric Stereo Matching for 3D Object Detection. ICRA 2020.
164 TopNet-DecayRate
This method makes use of Velodyne laser scans.
16.00 % 23.02 % 13.24 % 92 ms NVIDIA GeForce 1080 Ti (tensorflow-gpu)
S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
165 RT3D-GMP
This method uses stereo information.
13.92 % 20.59 % 12.74 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
H. Königshof and C. Stiller: Learning-Based Shape Estimation with Grid Map Patches for Realtime 3D Object Detection for Automated Driving. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC) 2020.
166 TopNet-UncEst
This method makes use of Velodyne laser scans.
9.18 % 12.31 % 8.14 % 0.09 s NVIDIA GeForce 1080 Ti (tensorflow-gpu)
S. Wirges, M. Braun, M. Lauer and C. Stiller: Capturing Object Detection Uncertainty in Multi-Layer Grid Maps. 2019.
167 ESGN
This method uses stereo information.
9.02 % 15.78 % 7.96 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
A. Gao, Y. Pang, J. Nie, Z. Shao, J. Cao, Y. Guo and X. Li: ESGN: Efficient Stereo Geometry Network for Fast 3D Object Detection. IEEE Transactions on Circuits and Systems for Video Technology 2022.
168 CMKD code 8.15 % 14.66 % 7.23 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
Y. Hong, H. Dai and Y. Ding: Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection. ECCV 2022.
169 PS-fld code 7.29 % 12.80 % 6.05 % 0.25 s 1 core @ 2.5 Ghz (C/C++)
Y. Chen, H. Dai and Y. Ding: Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.
170 Anonymous 7.24 % 12.53 % 6.21 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
171 DD3Dv2 code 7.02 % 10.67 % 5.78 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
172 SSAL-Mono 6.70 % 10.55 % 5.76 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
173 TopNet-HighRes
This method makes use of Velodyne laser scans.
6.48 % 9.99 % 6.76 % 101 ms NVIDIA GeForce 1080 Ti (tensorflow-gpu)
S. Wirges, T. Fischer, C. Stiller and J. Frias: Object Detection and Classification in Occupancy Grid Maps Using Deep Convolutional Networks. 2018 21st International Conference on Intelligent Transportation Systems (ITSC) 2018.
174 BSM3D 6.42 % 10.59 % 5.50 % 0.03 s 1 core @ 2.5 Ghz (Python)
175 MonoPSR code 5.78 % 9.87 % 4.57 % 0.2 s GPU @ 3.5 Ghz (Python)
J. Ku*, A. Pon* and S. Waslander: Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction. CVPR 2019.
176 DD3D code 5.69 % 9.20 % 5.20 % n/a s 1 core @ 2.5 Ghz (C/C++)
D. Park, R. Ambrus, V. Guizilini, J. Li and A. Gaidon: Is Pseudo-Lidar needed for Monocular 3D Object detection? IEEE/CVF International Conference on Computer Vision (ICCV).
177 MonoASS 5.57 % 9.42 % 5.02 % 0.04 s 1 core @ 2.5 Ghz (Python)
178 LT-M3OD 5.53 % 9.17 % 4.84 % 0.03 s 1 core @ 2.5 Ghz (Python)
179 CaDDN code 5.38 % 9.67 % 4.75 % 0.63 s GPU @ 2.5 Ghz (Python)
C. Reading, A. Harakeh, J. Chae and S. Waslander: Categorical Depth Distribution Network for Monocular 3D Object Detection. CVPR 2021.
180 Anonymous 5.17 % 7.71 % 4.31 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
181 LPCG-Monoflex code 4.90 % 8.14 % 3.86 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
L. Peng, F. Liu, Z. Yu, S. Yan, D. Deng, Z. Yang, H. Liu and D. Cai: Lidar Point Cloud Guided Monocular 3D Object Detection. ECCV 2022.
182 MonoAD 4.85 % 8.13 % 4.71 % 0.03 s GPU @ 2.5 Ghz (Python)
183 MDNet 4.74 % 8.10 % 4.19 % 0.2 s 1 core @ 2.5 Ghz (C/C++)
184 DEPT 4.71 % 8.82 % 4.15 % 0.03 s 1 core @ 2.5 Ghz (Python)
185 Lite-FPN-GUPNet 4.70 % 7.67 % 4.61 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
186 Shape-Aware 4.60 % 8.00 % 4.50 % 0.05 s 1 core @ 2.5 Ghz (Python)
187 MM3D 4.50 % 8.45 % 3.56 % NA s 1 core @ 2.5 Ghz (C/C++)
188 3DSeMoDLE code 4.47 % 7.51 % 3.84 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
189 BCA 4.42 % 6.89 % 3.91 % 0.17 s GPU @ 2.5 Ghz (Python)
190 MonoDDE 4.36 % 6.68 % 3.76 % 0.04 s 1 core @ 2.5 Ghz (Python)
Z. Li, Z. Qu, Y. Zhou, J. Liu, H. Wang and L. Jiang: Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection. CVPR 2022.
191 GPENet code 4.28 % 7.06 % 3.68 % 0.02 s GPU @ 2.5 Ghz (Python)
192 SparseLiDAR_fusion 4.26 % 7.77 % 3.45 % 0.08 s 1 core @ 2.5 Ghz (C/C++)
193 MonoDTR 4.11 % 5.84 % 3.48 % 0.04 s 1 core @ 2.5 Ghz (C/C++)
K. Huang, T. Wu, H. Su and W. Hsu: MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer. CVPR 2022.
194 RT3DStereo
This method uses stereo information.
4.10 % 7.03 % 3.88 % 0.08 s GPU @ 2.5 Ghz (C/C++)
H. Königshof, N. Salscheider and C. Stiller: Realtime 3D Object Detection for Automated Driving Using Stereo Vision and Semantic Information. Proc. IEEE Intl. Conf. Intelligent Transportation Systems 2019.
195 HomoLoss(monoflex) code 4.09 % 6.81 % 3.78 % 0.04 s 1 core @ 2.5 Ghz (Python)
J. Gu, B. Wu, L. Fan, J. Huang, S. Cao, Z. Xiang and X. Hua: Homography Loss for Monocular 3D Object Detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.
196 DFR-Net 4.00 % 5.99 % 3.95 % 0.18 s 1080 Ti (PyTorch)
Z. Zou, X. Ye, L. Du, X. Cheng, X. Tan, L. Zhang, J. Feng, X. Xue and E. Ding: The devil is in the task: Exploiting reciprocal appearance-localization features for monocular 3d object detection. ICCV 2021.
197 MonoInsight 3.99 % 6.56 % 3.49 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
198 DEVIANT code 3.97 % 6.42 % 3.51 % 0.04 s 1 GPU (Python)
A. Kumar, G. Brazil, E. Corona, A. Parchami and X. Liu: DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection. European Conference on Computer Vision (ECCV) 2022.
199 SARM3D 3.85 % 5.59 % 3.28 % 0.03 s GPU @ 2.5 Ghz (Python)
200 GUPNet code 3.85 % 6.94 % 3.64 % NA s 1 core @ 2.5 Ghz (Python + C/C++)
Y. Lu, X. Ma, L. Yang, T. Zhang, Y. Liu, Q. Chu, J. Yan and W. Ouyang: Geometry Uncertainty Projection Network for Monocular 3D Object Detection. arXiv preprint arXiv:2107.13774 2021.
201 MoGDE 3.76 % 6.04 % 3.09 % 0.03 s GPU @ 2.5 Ghz (Python)
202 OPA-3D code 3.75 % 6.01 % 3.56 % 0.04 s 1 core @ 3.5 Ghz (Python)
203 CIE 3.74 % 6.13 % 3.18 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
204 MonoAug 3.71 % 5.66 % 3.00 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
205 SGM3D code 3.63 % 7.05 % 3.33 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
Z. Zhou, L. Du, X. Ye, Z. Zou, X. Tan, L. Zhang, X. Xue and J. Feng: SGM3D: Stereo Guided Monocular 3D Object Detection. RA-L 2022.
206 DCD code 3.62 % 5.84 % 3.33 % 1 s 1 core @ 2.5 Ghz (C/C++)
207 AMNet 3.61 % 5.54 % 3.19 % 0.03 s GPU @ 1.0 Ghz (Python)
208 Aug3D-RPN 3.33 % 5.44 % 2.82 % 0.08 s 1 core @ 2.5 Ghz (C/C++)
C. He, J. Huang, X. Hua and L. Zhang: Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth. 2021.
209 monodle code 3.28 % 5.34 % 2.83 % 0.04 s GPU @ 2.5 Ghz (Python)
X. Ma, Y. Zhang, D. Xu, D. Zhou, S. Yi, H. Li and W. Ouyang: Delving into Localization Errors for Monocular 3D Object Detection. CVPR 2021.
210 MDSNet 3.22 % 5.99 % 2.62 % 0.05 s 1 core @ 2.5 Ghz (Python)
Z. Xie, Y. Song, J. Wu, Z. Li, C. Song and Z. Xu: MDS-Net: Multi-Scale Depth Stratification 3D Object Detection from Monocular Images. Sensors 2022.
211 MonoXiver 3.17 % 4.66 % 2.69 % 0.03 s GPU @ 2.5 Ghz (Python)
212 DDMP-3D 3.14 % 4.92 % 2.44 % 0.18 s 1 core @ 2.5 Ghz (Python)
L. Wang, L. Du, X. Ye, Y. Fu, G. Guo, X. Xue, J. Feng and L. Zhang: Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection. CVPR 2020.
213 BAIR 3.07 % 4.48 % 2.73 % 0.03 s 1 core @ 2.5 Ghz (Python)
214 MonoA^2 3.04 % 5.41 % 2.67 % na s 1 core @ 2.5 Ghz (C/C++)
215 QD-3DT
This is an online method (no batch processing).
code 3.02 % 5.71 % 2.73 % 0.03 s GPU @ 2.5 Ghz (Python)
H. Hu, Y. Yang, T. Fischer, F. Yu, T. Darrell and M. Sun: Monocular Quasi-Dense 3D Object Tracking. ArXiv:2103.07351 2021.
216 M3DGAF 3.02 % 5.33 % 2.87 % 0.07 s 1 core @ 2.5 Ghz (Python)
217 MonoPair 2.87 % 4.76 % 2.42 % 0.06 s GPU @ 2.5 Ghz (Python + C/C++)
Y. Chen, L. Tai, K. Sun and M. Li: MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020.
218 MonoNeRD 2.80 % 5.24 % 2.55 % na s 1 core @ 2.5 Ghz (C/C++)
219 MonoFlex 2.67 % 4.41 % 2.50 % 0.03 s GPU @ 2.5 Ghz (Python)
Y. Zhang, J. Lu and J. Zhou: Objects are Different: Flexible Monocular 3D Object Detection. CVPR 2021.
220 MK3D 2.55 % 4.17 % 2.24 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
221 Mono3DMethod 2.50 % 4.09 % 2.15 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
222 MonoAug 2.46 % 4.31 % 2.21 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
223 MonoPCNS 2.46 % 4.65 % 2.42 % 0.14 s GPU @ 2.5 Ghz (Python)
224 RefinedMPL 2.42 % 4.23 % 2.14 % 0.15 s GPU @ 2.5 Ghz (Python + C/C++)
J. Vianney, S. Aich and B. Liu: RefinedMPL: Refined Monocular PseudoLiDAR for 3D Object Detection in Autonomous Driving. arXiv preprint arXiv:1911.09712 2019.
225 MonoRCNN++ code 2.31 % 3.50 % 2.01 % 0.07 s GPU @ 2.5 Ghz (Python)
X. Shi, Z. Chen and T. Kim: Multivariate Probabilistic Monocular 3D Object Detection. WACV 2023.
226 GCDR 2.11 % 3.74 % 1.99 % 0.28 s 1 core @ 2.5 Ghz (Python)
227 SS3D 1.89 % 3.45 % 1.44 % 48 ms Tesla V100 (Python)
E. Jörgensen, C. Zach and F. Kahl: Monocular 3D Object Detection and Box Fitting Trained End-to-End Using Intersection-over-Union Loss. CoRR 2019.
228 D4LCN code 1.82 % 2.72 % 1.79 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
M. Ding, Y. Huo, H. Yi, Z. Wang, J. Shi, Z. Lu and P. Luo: Learning Depth-Guided Convolutions for Monocular 3D Object Detection. CVPR 2020.
229 PGD-FCOS3D code 1.79 % 3.54 % 1.56 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
T. Wang, X. Zhu, J. Pang and D. Lin: Probabilistic and Geometric Depth: Detecting Objects in Perspective. Conference on Robot Learning (CoRL) 2021.
230 FMF-occlusion-net 1.65 % 1.91 % 1.75 % 0.16 s 1 core @ 2.5 Ghz (Python + C/C++)
H. Liu, H. Liu, Y. Wang, F. Sun and W. Huang: Fine-grained Multi-level Fusion for Anti-occlusion Monocular 3D Object Detection. IEEE Transactions on Image Processing 2022.
231 HBD 1.64 % 3.15 % 1.76 % 0.12 s 1 core @ 2.5 Ghz (C/C++)
232 CMAN 1.48 % 1.76 % 1.17 % 0.15 s 1 core @ 2.5 Ghz (Python)
C. Yuanzhouhan Cao: CMAN: Leaning Global Structure Correlation for Monocular 3D Object Detection. IEEE Trans. Intell. Transport. Syst. 2022.
233 MonoEF 1.18 % 2.36 % 1.15 % 0.03 s 1 core @ 2.5 Ghz (Python)
Y. Zhou, Y. He, H. Zhu, C. Wang, H. Li and Q. Jiang: Monocular 3D Object Detection: An Extrinsic Parameter Free Approach. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021.
234 M3D-RPN code 0.81 % 1.25 % 0.78 % 0.16 s GPU @ 1.5 Ghz (Python)
G. Brazil and X. Liu: M3D-RPN: Monocular 3D Region Proposal Network for Object Detection. ICCV 2019.
235 MonoRUn code 0.73 % 1.14 % 0.66 % 0.07 s GPU @ 2.5 Ghz (Python + C/C++)
H. Chen, Y. Huang, W. Tian, Z. Gao and L. Xiong: MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2021.
236 Shift R-CNN (mono) code 0.38 % 0.76 % 0.41 % 0.25 s GPU @ 1.5 Ghz (Python)
A. Naiden, V. Paunescu, G. Kim, B. Jeon and M. Leordeanu: Shift R-CNN: Deep Monocular 3D Object Detection With Closed-form Geometric Constraints. ICIP 2019.
237 CDTrack3D code 0.10 % 0.24 % 0.11 % 0.0106 s NVIDIA RTX 3090 GPU, i9 10850k CPU
238 mBoW
This method makes use of Velodyne laser scans.
0.00 % 0.00 % 0.00 % 10 s 1 core @ 2.5 Ghz (C/C++)
J. Behley, V. Steinhage and A. Cremers: Laser-based Segment Classification Using a Mixture of Bag-of-Words. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2013.

Citation

When using this dataset in your research, we will be happy if you cite us:
@INPROCEEDINGS{Geiger2012CVPR,
  author = {Andreas Geiger and Philip Lenz and Raquel Urtasun},
  title = {Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2012}
}


