Object Detection Evaluation 2012


The object detection and object orientation estimation benchmark consists of 7481 training images and 7518 test images, comprising a total of 80,256 labeled objects. All images are color and saved as PNG. For evaluation, we compute precision-recall curves for object detection and orientation-similarity-recall curves for joint object detection and orientation estimation. In the latter case, not only must the object's 2D bounding box be located correctly, but the orientation estimate in bird's eye view is also evaluated. To rank the methods, we compute average precision and average orientation similarity. We require that all methods use the same parameter set for all test pairs. Our development kit provides details about the data format as well as MATLAB / C++ utility functions for reading and writing the label files.
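As a rough illustration of how the two ranking numbers could be derived from the curves, the sketch below averages the best value (precision or orientation similarity) attainable at or beyond evenly spaced recall levels. The 11-point sampling follows the PASCAL VOC convention and is an assumption here; the official evaluation script may sample differently. The function names are hypothetical, and the orientation term (1 + cos(delta)) / 2, normalized over all detections so that false positives contribute zero, is an assumption based on the measure described in the CVPR 2012 publication.

```python
import math


def interpolated_average(recalls, values, num_points=11):
    """Average, over evenly spaced recall levels, of the best value
    reached at any recall >= that level (0 if the level is never reached).

    `values` holds precision (for AP) or orientation similarity (for AOS)
    measured at the corresponding entries of `recalls`.
    """
    total = 0.0
    for k in range(num_points):
        level = k / (num_points - 1)
        reachable = [v for r, v in zip(recalls, values) if r >= level - 1e-9]
        total += max(reachable) if reachable else 0.0
    return total / num_points


def orientation_similarity(angle_errors, num_detections):
    """Orientation similarity at one operating point (assumed form).

    `angle_errors` lists the angle difference in radians for each true
    positive; false positives add nothing to the sum but still count in
    `num_detections`, so the result never exceeds the precision.
    """
    if num_detections == 0:
        return 0.0
    score = sum((1.0 + math.cos(d)) / 2.0 for d in angle_errors)
    return score / num_detections
```

Passing precision values yields an average precision, while passing orientation similarity values yields an average orientation similarity over the same recall levels.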

We evaluate object detection performance using the PASCAL criteria, and object detection and orientation estimation performance using the measure discussed in our CVPR 2012 publication. For cars we require an overlap of 70%, while for pedestrians and cyclists we require an overlap of 50% for a detection. Detections in don't care areas or detections which are smaller than the minimum size do not count as false positives. Difficulties are defined as follows:
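The overlap requirement can be sketched as an intersection-over-union test with a class-dependent threshold. This is a minimal illustration, not the official evaluation code: the box layout (left, top, right, bottom), the names `MIN_OVERLAP`, `iou` and `is_match`, and the strict "greater than" comparison (borrowed from the PASCAL convention) are all assumptions.

```python
# Required overlap per class, as stated in the text above.
MIN_OVERLAP = {"Car": 0.7, "Pedestrian": 0.5, "Cyclist": 0.5}


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (left, top, right, bottom)."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not intersect
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def is_match(object_class, detection_box, ground_truth_box):
    """True if the detection overlaps the ground truth enough for its class."""
    return iou(detection_box, ground_truth_box) > MIN_OVERLAP[object_class]
```

For example, two boxes with an overlap of 0.6 would count as a match for a pedestrian but not for a car.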

  • Easy: Min. bounding box height: 40 Px, Max. occlusion level: Fully visible, Max. truncation: 15 %
  • Moderate: Min. bounding box height: 25 Px, Max. occlusion level: Partly occluded, Max. truncation: 30 %
  • Hard: Min. bounding box height: 25 Px, Max. occlusion level: Difficult to see, Max. truncation: 50 %
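
The three difficulty definitions above can be written as a small lookup. This is a sketch under assumptions: the integer occlusion coding (0 = fully visible, 1 = partly occluded, 2 = difficult to see), truncation expressed as a fraction between 0 and 1, and the names `DIFFICULTIES` and `difficulty_levels` are illustrative rather than taken from the evaluation script.

```python
# (name, min. box height in pixels, max. occlusion level, max. truncation)
DIFFICULTIES = [
    ("Easy", 40, 0, 0.15),
    ("Moderate", 25, 1, 0.30),
    ("Hard", 25, 2, 0.50),
]


def difficulty_levels(height_px, occlusion, truncation):
    """Return the difficulty levels in which a labeled object is evaluated."""
    return [
        name
        for name, min_height, max_occlusion, max_truncation in DIFFICULTIES
        if height_px >= min_height
        and occlusion <= max_occlusion
        and truncation <= max_truncation
    ]
```

Because the limits are nested, an object that qualifies as Easy also counts toward Moderate and Hard.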

All methods are ranked based on the moderately difficult results. Note that for the hard evaluation ~2 % of the provided bounding boxes have not been recognized by humans, thereby upper bounding recall at 98 %. Hence, the hard evaluation is only given for reference.
Note 1: On 25.04.2017, we fixed a bug in the object detection evaluation script. Submitted detections are now filtered based on the min. bounding box height for the respective category, which was previously done only for the ground truth. Before the fix, this led to false positives for the category "Easy" when bounding boxes of height 25-39 Px were submitted (and to false positives for all categories if bounding boxes smaller than 25 Px were submitted). We would like to thank Amy Wu, Matt Wilder, Pekka Jänis and Philippe Vandermersch for their feedback. The last leaderboards right before the changes can be found here!
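
The effect of the height filter described in Note 1 can be pictured as follows. This is a simplified sketch, not the evaluation script itself: detections are assumed to be dictionaries with hypothetical `top` and `bottom` pixel coordinates, and undersized detections are simply dropped here so that they cannot be counted as false positives.

```python
# Minimum bounding box height in pixels per difficulty level.
MIN_HEIGHT_PX = {"Easy": 40, "Moderate": 25, "Hard": 25}


def filter_detections(detections, difficulty):
    """Keep only detections tall enough to be evaluated at this difficulty."""
    min_height = MIN_HEIGHT_PX[difficulty]
    return [d for d in detections if d["bottom"] - d["top"] >= min_height]
```

With this filter, a submitted box of height 30 Px is ignored in the Easy evaluation but still evaluated for Moderate and Hard.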

Additional information used by the methods
  • Stereo: Method uses left and right (stereo) images
  • Flow: Method uses optical flow (2 temporally adjacent images)
  • Multiview: Method uses more than 2 temporally adjacent images
  • Laser Points: Method uses point clouds from Velodyne laser scanner
  • Additional training data: Use of additional data sources for training (see details)

Car


Method Setting Code Moderate Easy Hard Runtime Environment
1 iDST-VC 90.55 % 90.88 % 81.04 % 4 s GPU @ 2.5 Ghz (Python + C/C++)
2 BM-NET 90.48 % 90.83 % 80.63 % 4.0 s GPU @ 2.5 Ghz (C/C++)
3 Genome 90.43 % 90.85 % 81.97 % 4 s GPU @ 2.5 Ghz (C/C++)
4 SAIT 90.36 % 90.78 % 80.48 % 0.15 s GPU @ >3.5 Ghz (Python + C/C++)
5 TuSimple 90.33 % 90.77 % 82.86 % 1.6 s GPU @ 2.5 Ghz (Python + C/C++)
6 eagle 90.25 % 90.77 % 85.20 % 4 s GPU @ 2.5 Ghz (C/C++)
7 RRC code 90.22 % 90.61 % 87.44 % 3.6 s GPU @ 2.5 Ghz (Python + C/C++)
J. Ren, X. Chen, J. Liu, W. Sun, J. Pang, Q. Yan, Y. Tai and L. Xu: Accurate Single Stage Detector Using Recurrent Rolling Convolution. CVPR 2017.
8 RV-CNN 90.14 % 90.76 % 84.96 % 3.5 s GPU @ 2.5 Ghz (Python + C/C++)
9 DuEye 90.11 % 90.65 % 85.46 % 4 s GPU @ 2.5 Ghz (C/C++)
10 Direwolf 90.04 % 90.71 % 80.63 % 0.5 s GPU @ 2.5 Ghz (C/C++)
11 Deep MANTA 90.03 % 97.25 % 80.62 % 0.7 s GPU @ 2.5 Ghz (Python + C/C++)
F. Chabot, M. Chaouch, J. Rabarisoa, C. Teulière and T. Chateau: Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image. CVPR 2017.
12 sensekitti 90.00 % 90.76 % 81.83 % 4.5 s GPU @ 2.5 Ghz (Python + C/C++)
13 SAITv2 89.91 % 95.04 % 79.91 % 0.07 s GPU @ 2.5 Ghz (Python + C/C++)
14 NVDriveNet-H 89.81 % 90.09 % 80.08 % 0.15s GPU @ 2.5 Ghz (Python + C/C++)
15 Allspark 89.73 % 90.64 % 79.16 % 0.7 s GPU @ 2.5 Ghz (C/C++)
16 SINet+ 89.63 % 90.55 % 77.64 % 0.3 s GPU @ 2.5 Ghz (Matlab + C/C++)
17 SINet_VGG 89.56 % 90.60 % 78.19 % 0.2 s GPU @ 2.5 Ghz (Matlab + C/C++)
18 SDP+RPN 89.42 % 89.90 % 78.54 % 0.4 s GPU @ 2.5 Ghz (Python + C/C++)
F. Yang, W. Choi and Y. Lin: Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition 2016.
S. Ren, K. He, R. Girshick and J. Sun: Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems 2015.
19 DJML 89.40 % 90.51 % 79.58 % 2.4 s GPU @ 2.5 Ghz (Python + C/C++)
20 RaC 89.39 % 90.02 % 80.29 % 1 s GPU @ 1.0 Ghz (C/C++)
21 Pie 89.23 % 90.19 % 77.98 % 1.2 s 1 core @ 2.5 Ghz (C/C++)
22 uickitti 89.17 % 90.80 % 79.58 % 1.5 s GPU @ 2.5 Ghz (C/C++)
23 SINet_PVA 89.08 % 90.44 % 75.85 % 0.11 s GPU @ 2.5 Ghz (Matlab + C/C++)
24 HSR2 88.98 % 90.76 % 78.62 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
25 MV3D [Laser Points] 88.90 % 90.37 % 79.81 % 0.36 s GPU @ 2.5 Ghz (Python + C/C++)
X. Chen, H. Ma, J. Wan, B. Li and T. Xia: Multi-View 3D Object Detection Network for Autonomous Driving. CVPR 2017.
26 SubCNN 88.86 % 90.75 % 79.24 % 2 s GPU @ 3.5 Ghz (Python + C/C++)
Y. Xiang, W. Choi, Y. Lin and S. Savarese: Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection. IEEE Winter Conference on Applications of Computer Vision (WACV) 2017.
27 Deep3DBox 88.86 % 90.47 % 77.60 % 1.5 s GPU @ 2.5 Ghz (C/C++)
A. Mousavian, D. Anguelov, J. Flynn and J. Kosecka: 3D Bounding Box Estimation Using Deep Learning and Geometry. CVPR 2017.
28 MS-CNN code 88.83 % 90.46 % 74.76 % 0.4 s GPU @ 2.5 Ghz (C/C++)
Z. Cai, Q. Fan, R. Feris and N. Vasconcelos: A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection. ECCV 2016.
29 DeepStereoOP 88.75 % 90.34 % 79.39 % 3.4 s GPU @ 3.5 Ghz (Matlab + C/C++)
C. Pham and J. Jeon: Robust Object Proposals Re-ranking for Object Detection in Autonomous Driving Using Convolutional Neural Networks. Signal Processing: Image Communication 2017.
30 CPCD 88.73 % 90.28 % 79.20 % 3 s 1 core @ 2.5 Ghz (C/C++)
31 HM_SSD_RCNN 88.69 % 90.47 % 77.86 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
32 Re-3DOP 88.46 % 90.27 % 78.93 % 3 s 1 core @ 2.5 Ghz (C/C++)
33 3DOP [Stereo] code 88.34 % 90.09 % 78.79 % 3s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Y. Zhu, A. Berneshawi, H. Ma, S. Fidler and R. Urtasun: 3D Object Proposals for Accurate Object Class Detection. NIPS 2015.
34 HM3D 88.26 % 89.86 % 78.24 % 0.35 s GPU @ >3.5 Ghz (C/C++)
35 MM-MRFC [Flow, Laser Points] 88.20 % 90.93 % 78.02 % 0.05 s GPU @ 2.5 Ghz (C/C++)
36 SYVO 88.14 % 89.60 % 71.05 % 0.13 s GPU @ 2.5 Ghz (C/C++)
37 Mono3D code 87.86 % 90.27 % 78.09 % 4.2 s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Z. Zhang, H. Ma, S. Fidler and R. Urtasun: Monocular 3D Object Detection for Autonomous Driving. CVPR 2016.
38 WRInception 87.62 % 88.98 % 77.52 % 0.06 s GPU @ 2.5 Ghz (C/C++)
39 CNN based 87.42 % 88.76 % 77.55 % 1s 1 core @ 2.5 Ghz (C/C++)
40 UI 87.34 % 89.56 % 71.16 % 0.4 s GPU @ 2.5 Ghz (C/C++)
41 tbd 86.97 % 90.10 % 77.94 % 1 s 1 core @ 2.5 Ghz (C/C++)
42 TWSNet 86.30 % 90.03 % 71.36 % 0.48 s GPU @ 3.5 Ghz (Matlab + C/C++)
43 VCTNet 85.97 % 89.37 % 75.89 % 0.18 s GPU @ 3.5 GHz (C/C++)
44 ANM 85.61 % 86.80 % 77.20 % 0.05 s GPU @ 2.5 Ghz (C/C++)
45 LPN 81.67 % 87.70 % 72.69 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
46 SDP+CRC (ft) 81.33 % 90.39 % 70.33 % 0.6 s GPU @ 2.5 Ghz (C/C++)
F. Yang, W. Choi and Y. Lin: Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition 2016.
47 ANM 81.29 % 85.23 % 69.32 % 0.05 s GPU @ 2.5 Ghz (C/C++)
48 MV3D (LIDAR) [Laser Points] 79.76 % 89.80 % 78.61 % 0.24 s GPU @ 2.5 Ghz (Python + C/C++)
X. Chen, H. Ma, J. Wan, B. Li and T. Xia: Multi-View 3D Object Detection Network for Autonomous Driving. CVPR 2017.
49 RefineNet 79.21 % 90.16 % 65.71 % 0.20 s GPU @ 2.5 Ghz (Matlab + C++)
R. Rajaram, E. Bar and M. Trivedi: RefineNet: Iterative Refinement for Accurate Object Localization. Intelligent Transportation Systems Conference 2016.
50 Faster R-CNN code 79.11 % 87.90 % 70.19 % 2 s GPU @ 3.5 Ghz (Python + C/C++)
S. Ren, K. He, R. Girshick and J. Sun: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.
51 PNET 78.98 % 85.21 % 71.20 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
52 FRCNN+Or 78.59 % 89.60 % 68.69 % 0.1 s GPU @ 1.5 Ghz (Python + C/C++)
C. Guindel, D. Martin and J. Armingol: Joint Object Detection and Viewpoint Estimation using CNN features. IEEE International Conference on Vehicular Electronics and Safety (ICVES) 2017.
53 HM 77.72 % 87.90 % 61.36 % 1 s 1 core @ 2.5 Ghz (C/C++)
54 spLBP 77.39 % 80.16 % 60.59 % 1.5 s 8 cores @ 2.5 Ghz (Matlab + C/C++)
Q. Hu, S. Paisitkriangkrai, C. Shen, A. Hengel and F. Porikli: Fast Detection of Multiple Objects in Traffic Scenes With a Common Detection Framework. IEEE Trans. Intelligent Transportation Systems 2016.
55 SceneNet 77.34 % 87.90 % 68.38 % 0.03 s GPU @ 2.5 Ghz (C/C++)
56 Reinspect code 76.65 % 88.36 % 66.56 % 2s 1 core @ 2.5 Ghz (C/C++)
R. Stewart, M. Andriluka and A. Ng: End-to-End People Detection in Crowded Scenes. CVPR 2016.
57 Regionlets 76.56 % 86.50 % 59.82 % 1 s >8 cores @ 2.5 Ghz (C/C++)
X. Wang, M. Yang, S. Zhu and Y. Lin: Regionlets for Generic Object Detection. T-PAMI 2015.
W. Zou, X. Wang, M. Sun and Y. Lin: Generic Object Detection with Dense Neural Patterns and Regionlets. British Machine Vision Conference 2014.
C. Long, X. Wang, G. Hua, M. Yang and Y. Lin: Accurate Object Detection with Location Relaxation and Regionlets Relocalization. Asian Conference on Computer Vision 2014.
58 AOG code 75.97 % 85.58 % 60.96 % 3 s 4 cores @ 2.5 Ghz (Matlab)
T. Wu, B. Li and S. Zhu: Learning And-Or Models to Represent Context and Occlusion for Car Detection and Viewpoint Estimation. TPAMI 2016.
B. Li, T. Wu and S. Zhu: Integrating Context and Occlusion for Car Detection by Hierarchical And-Or Model. ECCV 2014.
59 3D FCN [Laser Points] 75.83 % 85.54 % 68.30 % >5 s 1 core @ 2.5 Ghz (C/C++)
B. Li: 3D Fully Convolutional Network for Vehicle Detection in Point Cloud. IROS 2017.
60 3DVP code 75.77 % 81.46 % 65.38 % 40 s 8 cores @ 3.5 Ghz (Matlab + C/C++)
Y. Xiang, W. Choi, Y. Lin and S. Savarese: Data-Driven 3D Voxel Patterns for Object Category Recognition. IEEE Conference on Computer Vision and Pattern Recognition 2015.
61 Pose-RCNN 75.74 % 88.89 % 61.86 % 2 s >8 cores @ 2.5 Ghz (Python)
62 AR-FCN 75.49 % 81.24 % 66.00 % 0.19 s GPU @ 2.5 Ghz (C/C++)
63 SubCat code 75.46 % 81.45 % 59.71 % 0.7 s 6 cores @ 3.5 Ghz (Matlab + C/C++)
E. Ohn-Bar and M. Trivedi: Learning to Detect Vehicles by Clustering Appearance Patterns. T-ITS 2015.
64 FD2 74.68 % 87.14 % 65.70 % 0.01 s GPU @ >3.5 Ghz (Python + C/C++)
65 FD 72.64 % 82.34 % 60.31 % 0.01 s GPU @ >3.5 Ghz (Python)
66 SmartCNN 71.10 % 77.00 % 56.97 % 1 s 1 core @ 2.5 Ghz (C/C++)
67 FCNN 70.67 % 88.04 % 61.50 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
68 MV-RGBD-RF [Laser Points] 69.92 % 76.49 % 57.47 % 4 s 4 cores @ 2.5 Ghz (C/C++)
A. Gonzalez, D. Vazquez, A. Lopez and J. Amores: On-Board Object Detection: Multicue, Multimodal, and Multiview Random Forest of Local Experts. IEEE Trans. on Cybernetics 2016.
A. Gonzalez, G. Villalonga, J. Xu, D. Vazquez, J. Amores and A. Lopez: Multiview Random Forest of Local Experts Combining RGB and LIDAR data for Pedestrian Detection. IEEE Intelligent Vehicles Symposium (IV) 2015.
69 AOG-View 69.89 % 84.29 % 57.25 % 3 s 1 core @ 2.5 Ghz (Matlab, C/C++)
B. Li, T. Wu and S. Zhu: Integrating Context and Occlusion for Car Detection by Hierarchical And-Or Model. ECCV 2014.
70 YOLOv2 code 69.01 % 86.40 % 59.57 % 0.03 s GPU @ 2.0 Ghz (Python + C/C++)
71 Vote3Deep [Laser Points] 68.39 % 76.95 % 63.22 % 1.5 s 4 cores @ 2.5 Ghz (C/C++)
M. Engelcke, D. Rao, D. Zeng Wang, C. Hay Tong and I. Posner: Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks. ArXiv e-prints 2016.
72 ZGC 68.02 % 85.19 % 58.88 % 0.12 s 1 core @ 2.5 Ghz (C/C++)
73 GPVL 67.89 % 77.76 % 58.23 % 10 s 1 core @ 2.5 Ghz (C/C++)
74 GVPL 67.78 % 77.74 % 57.69 % 1 s 8 cores @ 2.5 Ghz (Matlab + C/C++)
75 QHY 67.69 % 85.23 % 58.64 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
76 BdCost48LDCF 67.07 % 77.93 % 51.15 % 5 s 1 core @ 2.5 Ghz (C/C++)
77 OC-DPM 66.45 % 76.16 % 53.70 % 10 s 8 cores @ 2.5 Ghz (Matlab)
B. Pepik, M. Stark, P. Gehler and B. Schiele: Occlusion Patterns for Object Class Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2013.
78 DPM-VOC+VP 66.25 % 80.45 % 49.86 % 8 s 1 core @ 2.5 Ghz (C/C++)
B. Pepik, M. Stark, P. Gehler and B. Schiele: Multi-view and 3D Deformable Part Models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2015.
79 BdCost48-25C 65.95 % 78.21 % 51.23 % 4 s 1 core @ 2.5 Ghz (C/C++)
80 RCNN 65.94 % 84.47 % 51.00 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
81 HL 64.94 % 77.55 % 50.53 % 0.16 s 1 core @ 2.5 Ghz (C/C++)
82 LCNN 64.74 % 79.18 % 56.79 % 1 s 1 core @ 2.5 Ghz (C/C++)
83 MDPM-un-BB 64.20 % 77.32 % 50.18 % 60 s 4 cores @ 2.5 Ghz (MATLAB)
P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.
84 BNet 63.24 % 75.09 % 56.05 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
85 GNN 62.59 % 76.03 % 50.18 % 0.2 s 1 core @ 2.5 Ghz (Python)
86 NMF-CNN 62.21 % 77.32 % 49.29 % 0.1 s GPU @ 2.5 Ghz (Matlab + C/C++)
87 NMRDO 61.72 % 79.48 % 54.06 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
88 SubCat48LDCF 61.57 % 74.14 % 48.18 % 5 s 1 core @ 2.5 Ghz (Matlab + C/C++)
89 DPM-C8B1 [Stereo] 60.99 % 74.95 % 47.16 % 15 s 4 cores @ 2.5 Ghz (Matlab + C/C++)
J. Yebes, L. Bergasa and M. García-Garrido: Visual Object Recognition with 3D-Aware Features in KITTI Urban Scenes. Sensors 2015.
J. Yebes, L. Bergasa, R. Arroyo and A. Lázaro: Supervised learning and evaluation of KITTI's cars detector with DPM. IV 2014.
90 HgCNN 60.72 % 73.77 % 52.95 % 1 s 1 core @ 2.5 Ghz (C/C++)
91 GNN 58.29 % 76.26 % 49.96 % 0.2 s 1 core @ 2.5 Ghz (Python)
92 LSVM-MDPM-sv 57.44 % 71.70 % 46.58 % 10 s 4 cores @ 3.0 Ghz (C/C++)
P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.
A. Geiger, C. Wojek and R. Urtasun: Joint 3D Estimation of Objects and Scene Layout. NIPS 2011.
93 Faster RCNN 56.58 % 62.31 % 45.27 % 0.11 s GPU @ 2.5 Ghz (Python)
94 LSVM-MDPM-us code 56.10 % 70.52 % 42.87 % 10 s 4 cores @ 3.0 Ghz (C/C++)
P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.
95 ACF-SC 55.76 % 69.76 % 46.27 % <0.3 s 1 core @ >3.5 Ghz (Matlab + C/C++)
C. Cadena, A. Dick and I. Reid: A Fast, Modular Scene Understanding System using Context-Aware Object Detection. Robotics and Automation (ICRA), 2015 IEEE International Conference on 2015.
96 frd 55.70 % 69.86 % 48.53 % 2 s 1 core @ 2.5 Ghz (C/C++)
97 FRO 53.78 % 70.96 % 46.00 % 0.19 s GPU @ 2.5 Ghz (Python)
98 MLSmoke 53.54 % 77.41 % 44.63 % 1 s 1 core @ 2.5 Ghz (C/C++)
99 VeloFCN [Laser Points] 53.45 % 70.68 % 46.90 % 1 s GPU @ 2.5 Ghz (Python + C/C++)
B. Li, T. Zhang and T. Xia: Vehicle Detection from 3D Lidar Using Fully Convolutional Network. RSS 2016.
100 ACF 52.81 % 62.82 % 43.89 % 0.2 s 1 core @ >3.5 Ghz (Matlab + C/C++)
P. Dollár, R. Appel, S. Belongie and P. Perona: Fast Feature Pyramids for Object Detection. PAMI 2014.
P. Dollár: Piotr's Image and Video Matlab Toolbox (PMT).
101 Vote3D [Laser Points] 48.05 % 56.66 % 42.64 % 0.5 s 4 cores @ 2.8 Ghz (C/C++)
D. Wang and I. Posner: Voting for Voting in Online Point Cloud Object Detection. Proceedings of Robotics: Science and Systems 2015.
102 YOLO 35.86 % 49.47 % 29.74 % 0.03 s GPU @ 1.0 Ghz (C/C++)
103 CSoR [Laser Points] code 26.13 % 35.24 % 22.69 % 3.5 s 4 cores @ >3.5 Ghz (Python + C/C++)
L. Plotkin: PyDriver: Entwicklung eines Frameworks für räumliche Detektion und Klassifikation von Objekten in Fahrzeugumgebung. 2015.
104 R-CNN_VGG 26.04 % 32.23 % 20.93 % 10 s GPU @ 2.5 Ghz (Matlab + C/C++)
105 FCN-Depth code 25.66 % 50.55 % 24.95 % 1 s GPU @ 1.5 Ghz (Matlab + C/C++)
106 mBoW [Laser Points] 23.76 % 37.63 % 18.44 % 10 s 1 core @ 2.5 Ghz (C/C++)
J. Behley, V. Steinhage and A. Cremers: Laser-based Segment Classification Using a Mixture of Bag-of-Words. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2013.
107 YOLOv2 code 19.31 % 28.37 % 15.94 % 0.02 s GPU @ 3.5 Ghz (C/C++)
J. Redmon, S. Divvala, R. Girshick and A. Farhadi: You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016.
J. Redmon and A. Farhadi: YOLO9000: Better, Faster, Stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017.

Pedestrian


Method Setting Code Moderate Easy Hard Runtime Environment
1 TuSimple 77.04 % 86.78 % 72.40 % 1.6 s GPU @ 2.5 Ghz (Python + C/C++)
2 RRC code 75.33 % 84.14 % 70.39 % 3.6 s GPU @ 2.5 Ghz (Python + C/C++)
J. Ren, X. Chen, J. Liu, W. Sun, J. Pang, Q. Yan, Y. Tai and L. Xu: Accurate Single Stage Detector Using Recurrent Rolling Convolution. CVPR 2017.
3 iFDT 74.83 % 86.02 % 70.55 % 2.4 s GPU @ 2.5 Ghz (Python + C/C++)
4 Allspark 74.22 % 84.44 % 68.61 % 0.7 s GPU @ 2.5 Ghz (C/C++)
5 TiCNN 74.07 % 84.00 % 68.50 % 0.5 s GPU @ 2.5 Ghz (Matlab + C/C++)
6 MS-CNN code 73.62 % 83.70 % 68.28 % 0.4 s GPU @ 2.5 Ghz (C/C++)
Z. Cai, Q. Fan, R. Feris and N. Vasconcelos: A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection. ECCV 2016.
7 Pie 73.17 % 84.25 % 67.55 % 1.2 s 1 core @ 2.5 Ghz (C/C++)
8 SAIT 72.61 % 84.79 % 67.94 % 0.15 s GPU @ >3.5 Ghz (Python + C/C++)
9 VCTNet 72.32 % 82.05 % 66.39 % 0.18 s GPU @ 3.5 GHz (C/C++)
10 uickitti 71.84 % 83.45 % 67.00 % 1.5 s GPU @ 2.5 Ghz (C/C++)
11 GN 71.55 % 80.73 % 64.82 % 1 s GPU @ 2.5 Ghz (Matlab + C/C++)
12 SubCNN 71.34 % 83.17 % 66.36 % 2 s GPU @ 3.5 Ghz (Python + C/C++)
Y. Xiang, W. Choi, Y. Lin and S. Savarese: Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection. IEEE Winter Conference on Applications of Computer Vision (WACV) 2017.
13 IVA code 70.63 % 83.03 % 64.68 % 0.4 s GPU @ 2.5 Ghz (C/C++)
Y. Zhu, J. Wang, C. Zhao, H. Guo and H. Lu: Scale-adaptive Deconvolutional Regression Network for Pedestrian Detection. ACCV 2016.
S. Ren, K. He, R. Girshick and J. Sun: Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in neural information processing systems 2015.
14 SDP+RPN 70.20 % 79.98 % 64.84 % 0.4 s GPU @ 2.5 Ghz (Python + C/C++)
F. Yang, W. Choi and Y. Lin: Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition 2016.
S. Ren, K. He, R. Girshick and J. Sun: Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems 2015.
15 MM-MRFC [Flow, Laser Points] 69.96 % 82.37 % 64.76 % 0.05 s GPU @ 2.5 Ghz (C/C++)
16 WRInception 68.76 % 79.98 % 63.48 % 0.06 s GPU @ 2.5 Ghz (C/C++)
17 3DOP [Stereo] code 67.46 % 82.36 % 64.71 % 3s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Y. Zhu, A. Berneshawi, H. Ma, S. Fidler and R. Urtasun: 3D Object Proposals for Accurate Object Class Detection. NIPS 2015.
18 DeepStereoOP 67.32 % 82.50 % 65.14 % 3.4 s GPU @ 3.5 Ghz (Matlab + C/C++)
C. Pham and J. Jeon: Robust Object Proposals Re-ranking for Object Detection in Autonomous Driving Using Convolutional Neural Networks. Signal Processing: Image Communication 2017.
19 sensekitti 67.28 % 80.12 % 62.25 % 4.5 s GPU @ 2.5 Ghz (Python + C/C++)
20 Re-3DOP 67.24 % 81.51 % 64.02 % 3 s 1 core @ 2.5 Ghz (C/C++)
21 Mono3D code 66.66 % 77.30 % 63.44 % 4.2 s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Z. Zhang, H. Ma, S. Fidler and R. Urtasun: Monocular 3D Object Detection for Autonomous Driving. CVPR 2016.
22 IVA code 66.50 % 75.89 % 61.60 % 1 s GPU @ 2.5 Ghz (Matlab + C/C++)
23 HM_SSD_RCNN 66.41 % 82.33 % 59.21 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
24 HM3D 65.97 % 77.60 % 61.09 % 0.35 s GPU @ >3.5 Ghz (C/C++)
25 HSR2 65.91 % 78.05 % 63.05 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
26 Faster R-CNN code 65.91 % 78.35 % 61.19 % 2 s GPU @ 3.5 Ghz (Python + C/C++)
S. Ren, K. He, R. Girshick and J. Sun: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.
27 Tx 65.08 % 77.36 % 59.43 % 2 s GPU @ 2.5 Ghz (Matlab + C/C++)
28 DJML 64.93 % 77.15 % 58.96 % 2.4 s GPU @ 2.5 Ghz (Python + C/C++)
29 PNET 64.66 % 75.71 % 60.41 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
30 tbd 64.56 % 79.59 % 61.27 % 1 s 1 core @ 2.5 Ghz (C/C++)
31 SDP+CRC (ft) 64.25 % 77.81 % 59.31 % 0.6 s GPU @ 2.5 Ghz (C/C++)
F. Yang, W. Choi and Y. Lin: Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition 2016.
32 Pose-RCNN 63.38 % 77.69 % 57.42 % 2 s >8 cores @ 2.5 Ghz (Python)
33 CFM 63.26 % 74.21 % 56.44 % <2 s GPU @ 2.5 Ghz (Matlab + C/C++)
Q. Hu, P. Wang, C. Shen, A. Hengel and F. Porikli: Pushing the Limits of Deep CNNs for Pedestrian Detection. IEEE Transactions on Circuits and Systems for Video Technology 2017.
34 PCN 62.08 % 74.71 % 56.68 % 0.6 s
35 RPN+BF code 61.29 % 75.58 % 56.08 % 0.6 s GPU @ 2.5 Ghz (Matlab + C/C++)
L. Zhang, L. Lin, X. Liang and K. He: Is Faster R-CNN Doing Well for Pedestrian Detection? ECCV 2016.
36 Regionlets 61.16 % 72.96 % 55.22 % 1 s >8 cores @ 2.5 Ghz (C/C++)
X. Wang, M. Yang, S. Zhu and Y. Lin: Regionlets for Generic Object Detection. T-PAMI 2015.
W. Zou, X. Wang, M. Sun and Y. Lin: Generic Object Detection with Dense Neural Patterns and Regionlets. British Machine Vision Conference 2014.
C. Long, X. Wang, G. Hua, M. Yang and Y. Lin: Accurate Object Detection with Location Relaxation and Regionlets Relocalization. Asian Conference on Computer Vision 2014.
37 RB 61.15 % 77.08 % 55.12 % 0.6 s GPU @ 2.5 Ghz (Matlab + C/C++)
38 LC 60.68 % 71.98 % 54.47 % 1 s 1 core @ 2.5 Ghz (Matlab + C/C++)
39 ens 60.64 % 72.30 % 54.59 %
40 CompACT-Deep 58.73 % 69.70 % 52.69 % 1 s 1 core @ 2.5 Ghz (Matlab + C/C++)
Z. Cai, M. Saberian and N. Vasconcelos: Learning Complexity-Aware Cascades for Deep Pedestrian Detection. ICCV 2015.
41 FichaDet 58.72 % 70.16 % 53.01 % 0.2 s 4 cores @ 2.5 Ghz (C/C++)
42 DeepParts 58.68 % 70.46 % 52.73 % ~1 s GPU @ 2.5 Ghz (Matlab)
Y. Tian, P. Luo, X. Wang and X. Tang: Deep Learning Strong Parts for Pedestrian Detection. ICCV 2015.
43 LPN 58.18 % 70.54 % 54.18 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
44 FilteredICF 57.12 % 69.05 % 51.46 % ~ 2 s >8 cores @ 2.5 Ghz (Matlab + C/C++)
S. Zhang, R. Benenson and B. Schiele: Filtered Channel Features for Pedestrian Detection. CVPR 2015.
45 FRCNN+Or 56.99 % 72.21 % 53.72 % 0.1 s GPU @ 1.5 Ghz (Python + C/C++)
C. Guindel, D. Martin and J. Armingol: Joint Object Detection and Viewpoint Estimation using CNN features. IEEE International Conference on Vehicular Electronics and Safety (ICVES) 2017.
46 p2dv 56.98 % 68.71 % 50.99 % 1 s 1 core @ 2.5 Ghz (C/C++)
47 D-TSF 56.77 % 69.03 % 50.77 % 1 s 1 core @ 2.5 Ghz (C/C++)
48 FD2 56.68 % 71.09 % 51.65 % 0.01 s GPU @ >3.5 Ghz (Python + C/C++)
49 MV-RGBD-RF [Laser Points] 56.59 % 73.05 % 49.63 % 4 s 4 cores @ 2.5 Ghz (C/C++)
A. Gonzalez, D. Vazquez, A. Lopez and J. Amores: On-Board Object Detection: Multicue, Multimodal, and Multiview Random Forest of Local Experts. IEEE Trans. on Cybernetics 2016.
A. Gonzalez, G. Villalonga, J. Xu, D. Vazquez, J. Amores and A. Lopez: Multiview Random Forest of Local Experts Combining RGB and LIDAR data for Pedestrian Detection. IEEE Intelligent Vehicles Symposium (IV) 2015.
50 ACNet+Cascad 56.23 % 66.17 % 50.67 % 2.5 s 1 core @ 3.5 Ghz (Matlab)
51 Vote3Deep [Laser Points] 55.38 % 67.94 % 52.62 % 1.5 s 4 cores @ 2.5 Ghz (C/C++)
M. Engelcke, D. Rao, D. Zeng Wang, C. Hay Tong and I. Posner: Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks. ArXiv e-prints 2016.
52 FD 55.33 % 67.87 % 50.02 % 0.01 s GPU @ >3.5 Ghz (Python)
53 pAUCEnsT 54.58 % 66.11 % 48.49 % 60 s 1 core @ 2.5 Ghz (Matlab + C/C++)
S. Paisitkriangkrai, C. Shen and A. Hengel: Pedestrian Detection with Spatially Pooled Features and Structured Ensemble Learning. arXiv 2014.
54 ANM 54.02 % 70.43 % 49.83 % 0.05 s GPU @ 2.5 Ghz (C/C++)
55 PDV2 53.74 % 65.71 % 49.47 % 3.7 s 1 core @ 3.0 Ghz Matlab (C/C++)
J. Shen, X. Zuo, J. Li, W. Yang and H. Ling: A novel pixel neighborhood differential statistic feature for pedestrian and face detection. Pattern Recognition 2017.
56 ANM 52.55 % 69.86 % 51.13 % 0.05 s GPU @ 2.5 Ghz (C/C++)
57 HM 51.89 % 68.95 % 43.86 % 1 s 1 core @ 2.5 Ghz (C/C++)
58 ACFD code 50.91 % 61.59 % 45.51 % 0.2 s 4 cores @ >3.5 Ghz (C/C++)
59 ZGC 50.42 % 67.07 % 42.79 % 0.12 s 1 core @ 2.5 Ghz (C/C++)
60 R-CNN 50.20 % 62.05 % 44.85 % 4 s GPU @ 3.3 Ghz (C/C++)
J. Hosang, M. Omran, R. Benenson and B. Schiele: Taking a Deeper Look at Pedestrians. arXiv 2015.
61 SSD1 50.14 % 63.93 % 47.46 % 0.255 s GPU @ 2.5 Ghz (Python + C/C++)
62 NMF-CNN 49.26 % 65.16 % 45.38 % 0.1 s GPU @ 2.5 Ghz (Matlab + C/C++)
63 ACF 47.29 % 60.11 % 42.90 % 1 s 1 core @ 3.5 Ghz (Matlab + C/C++)
P. Dollár, R. Appel, S. Belongie and P. Perona: Fast Feature Pyramids for Object Detection. PAMI 2014.
64 Fusion-DPM [Laser Points] code 46.67 % 59.38 % 42.05 % ~ 30 s 1 core @ 3.5 Ghz (Matlab + C/C++)
C. Premebida, J. Carreira, J. Batista and U. Nunes: Pedestrian Detection Combining RGB and Dense LIDAR Data. IROS 2014.
65 ACF-MR 46.23 % 58.85 % 42.10 % 0.6 s 1 core @ 3.5 Ghz (C/C++)
R. Rajaram, E. Ohn-Bar and M. Trivedi: Looking at Pedestrians at Different Scales: A Multi-resolution Approach and Evaluations. T-ITS 2016.
66 HA-SSVM 45.51 % 58.91 % 41.08 % 21 s 1 core @ >3.5 Ghz (Matlab + C/C++)
J. Xu, S. Ramos, D. Vázquez and A. López: Hierarchical Adaptive Structural SVM for Domain Adaptation. IJCV 2016.
67 FRO 45.43 % 57.56 % 40.50 % 0.19 s GPU @ 2.5 Ghz (Python)
68 DPM-VOC+VP 44.86 % 59.60 % 40.37 % 8 s 1 core @ 2.5 Ghz (C/C++)
B. Pepik, M. Stark, P. Gehler and B. Schiele: Multi-view and 3D Deformable Part Models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2015.
69 ACF-SC 44.77 % 54.20 % 39.57 % <0.3 s 1 core @ >3.5 Ghz (Matlab + C/C++)
C. Cadena, A. Dick and I. Reid: A Fast, Modular Scene Understanding System using Context-Aware Object Detection. Robotics and Automation (ICRA), 2015 IEEE International Conference on 2015.
70 SquaresICF code 44.42 % 57.47 % 40.08 % 1 s GPU @ >3.5 Ghz (C/C++)
R. Benenson, M. Mathias, T. Tuytelaars and L. Gool: Seeking the strongest rigid detector. CVPR 2013.
71 AR-FCN 43.88 % 53.16 % 35.58 % 0.19 s GPU @ 2.5 Ghz (C/C++)
72 QHY 43.42 % 60.19 % 42.31 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
73 YOLOv2 code 43.33 % 53.02 % 35.41 % 0.03 s GPU @ 2.0 Ghz (Python + C/C++)
74 GNN 42.56 % 58.22 % 40.53 % 0.2 s 1 core @ 2.5 Ghz (Python)
75 SubCat 42.34 % 54.06 % 37.95 % 1.2 s 6 cores @ 2.5 Ghz (Matlab + C/C++)
E. Ohn-Bar and M. Trivedi: Fast and Robust Object Detection Using Visual Subcategories. Computer Vision and Pattern Recognition Workshops Mobile Vision 2014.
76 HL 42.31 % 58.63 % 34.87 % 0.16 s 1 core @ 2.5 Ghz (C/C++)
77 RCNN 42.17 % 58.48 % 34.88 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
78 Fast-RCNN-SS 41.59 % 54.20 % 35.26 % 1 s GPU @ 2.0 Ghz (Matlab + C/C++)
79 GNN 40.69 % 55.22 % 38.65 % 0.2 s 1 core @ 2.5 Ghz (Python)
80 ACF 40.62 % 49.08 % 36.66 % 0.2 s 1 core @ >3.5 Ghz (Matlab + C/C++)
P. Dollár, R. Appel, S. Belongie and P. Perona: Fast Feature Pyramids for Object Detection. PAMI 2014.
P. Dollár: Piotr's Image and Video Matlab Toolbox (PMT).
81 NMRDO 40.59 % 55.43 % 39.75 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
82 ACFK code 40.23 % 48.83 % 33.57 % 0.07 s 1 core @ >3.5 Ghz (C/C++)
83 ACF_M 39.36 % 51.75 % 35.95 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
84 LSVM-MDPM-sv 39.36 % 51.75 % 35.95 % 10 s 4 cores @ 3.0 Ghz (C/C++)
P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.
A. Geiger, C. Wojek and R. Urtasun: Joint 3D Estimation of Objects and Scene Layout. NIPS 2011.
85 PCNN 39.07 % 53.60 % 37.91 % 1 s 1 core @ 2.5 Ghz (C/C++)
86 CNN 38.98 % 52.85 % 38.31 % 1 s 1 core @ 2.5 Ghz (C/C++)
87 LSVM-MDPM-us code 38.35 % 50.01 % 34.78 % 10 s 4 cores @ 3.0 Ghz (C/C++)
P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.
88 Vote3D [Laser Points] 35.74 % 44.47 % 33.72 % 0.5 s 4 cores @ 2.8 Ghz (C/C++)
D. Wang and I. Posner: Voting for Voting in Online Point Cloud Object Detection. Proceedings of Robotics: Science and Systems 2015.
89 BNet 34.40 % 41.05 % 28.88 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
90 mBoW [Laser Points] 31.37 % 44.36 % 30.62 % 10 s 1 core @ 2.5 Ghz (C/C++)
J. Behley, V. Steinhage and A. Cremers: Laser-based Segment Classification Using a Mixture of Bag-of-Words. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2013.
91 DPM-C8B1 [Stereo] 29.03 % 38.96 % 25.61 % 15 s 4 cores @ 2.5 Ghz (Matlab + C/C++)
J. Yebes, L. Bergasa and M. García-Garrido: Visual Object Recognition with 3D-Aware Features in KITTI Urban Scenes. Sensors 2015.
J. Yebes, L. Bergasa, R. Arroyo and A. Lázaro: Supervised learning and evaluation of KITTI's cars detector with DPM. IV 2014.
92 YOLO 24.35 % 25.63 % 17.50 % 0.03 s GPU @ 1.0 Ghz (C/C++)
93 R-CNN_VGG 23.16 % 28.95 % 22.17 % 10 s GPU @ 2.5 Ghz (Matlab + C/C++)
94 YOLOv2 code 16.19 % 20.80 % 15.43 % 0.02 s GPU @ 3.5 Ghz (C/C++)
J. Redmon, S. Divvala, R. Girshick and A. Farhadi: You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016.
J. Redmon and A. Farhadi: YOLO9000: Better, Faster, Stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017.
95 BIP-HETERO 13.38 % 14.85 % 13.25 % ~2 s 1 core @ 2.5 Ghz (C/C++)
A. Mekonnen, F. Lerasle, A. Herbulot and C. Briand: People Detection with Heterogeneous Features and Explicit Optimization on Computation Time. Pattern Recognition (ICPR), 2014 22nd International Conference on 2014.

Cyclist


Method Setting Code Moderate Easy Hard Runtime Environment
1 VCTNet 77.44 % 85.39 % 68.52 % 0.18 s GPU @ 3.5 GHz (C/C++)
2 RRC code 76.47 % 84.96 % 65.46 % 3.6 s GPU @ 2.5 Ghz (Python + C/C++)
J. Ren, X. Chen, J. Liu, W. Sun, J. Pang, Q. Yan, Y. Tai and L. Xu: Accurate Single Stage Detector Using Recurrent Rolling Convolution. CVPR 2017.
3 SAIT 75.88 % 83.99 % 66.45 % 0.15 s GPU @ >3.5 Ghz (Python + C/C++)
4 Pie 75.86 % 81.70 % 66.99 % 1.2 s 1 core @ 2.5 Ghz (C/C++)
5 TiCNN 74.99 % 81.90 % 65.40 % 0.5 s GPU @ 2.5 Ghz (Matlab + C/C++)
6 MS-CNN code 74.45 % 82.34 % 64.91 % 0.4 s GPU @ 2.5 Ghz (C/C++)
Z. Cai, Q. Fan, R. Feris and N. Vasconcelos: A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection. ECCV 2016.
7 TuSimple 74.26 % 81.38 % 64.88 % 1.6 s GPU @ 2.5 Ghz (Python + C/C++)
8 Allspark 74.25 % 81.77 % 65.23 % 0.7 s GPU @ 2.5 Ghz (C/C++)
9 Deep3DBox 73.48 % 82.65 % 64.11 % 1.5 s GPU @ 2.5 Ghz (C/C++)
A. Mousavian, D. Anguelov, J. Flynn and J. Kosecka: 3D Bounding Box Estimation Using Deep Learning and Geometry. CVPR 2017.
10 SDP+RPN 73.08 % 81.05 % 64.88 % 0.4 s GPU @ 2.5 Ghz (Python + C/C++)
F. Yang, W. Choi and Y. Lin: Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition 2016.
S. Ren, K. He, R. Girshick and J. Sun: Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems 2015.
11 sensekitti 72.50 % 81.76 % 64.00 % 4.5 s GPU @ 2.5 Ghz (Python + C/C++)
12 SubCNN 70.77 % 77.82 % 62.71 % 2 s GPU @ 3.5 Ghz (Python + C/C++)
Y. Xiang, W. Choi, Y. Lin and S. Savarese: Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection. IEEE Winter Conference on Applications of Computer Vision (WACV) 2017.
13 uickitti 70.72 % 77.57 % 62.23 % 1.5 s GPU @ 2.5 Ghz (C/C++)
14 DJML 70.32 % 78.76 % 61.89 % 2.4 s GPU @ 2.5 Ghz (Python + C/C++)
15 3DOP
This method uses stereo information.
code 68.81 % 80.17 % 61.36 % 3 s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Y. Zhu, A. Berneshawi, H. Ma, S. Fidler and R. Urtasun: 3D Object Proposals for Accurate Object Class Detection. NIPS 2015.
16 Re-3DOP 68.44 % 78.46 % 60.80 % 3 s 1 core @ 2.5 Ghz (C/C++)
17 Pose-RCNN 68.04 % 80.19 % 59.95 % 2 s >8 cores @ 2.5 Ghz (Python)
18 Vote3Deep
This method makes use of Velodyne laser scans.
67.96 % 76.49 % 62.88 % 1.5 s 4 cores @ 2.5 Ghz (C/C++)
M. Engelcke, D. Rao, D. Zeng Wang, C. Hay Tong and I. Posner: Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks. ArXiv e-prints 2016.
19 IVA code 67.36 % 77.63 % 59.62 % 0.4 s GPU @ 2.5 Ghz (C/C++)
Y. Zhu, J. Wang, C. Zhao, H. Guo and H. Lu: Scale-adaptive Deconvolutional Regression Network for Pedestrian Detection. ACCV 2016.
S. Ren, K. He, R. Girshick and J. Sun: Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in neural information processing systems 2015.
20 DeepStereoOP 65.72 % 77.00 % 57.74 % 3.4 s GPU @ 3.5 Ghz (Matlab + C/C++)
C. Pham and J. Jeon: Robust Object Proposals Re-ranking for Object Detection in Autonomous Driving Using Convolutional Neural Networks. Signal Processing: Image Communication 2017.
21 HSR2 64.94 % 76.36 % 57.62 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
22 HM_SSD_RCNN 64.67 % 77.55 % 54.70 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
23 HM3D 63.89 % 76.28 % 56.51 % 0.35 s GPU @ >3.5 Ghz (C/C++)
24 Mono3D code 63.85 % 75.22 % 58.96 % 4.2 s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Z. Zhang, H. Ma, S. Fidler and R. Urtasun: Monocular 3D Object Detection for Autonomous Driving. CVPR 2016.
25 tbd 63.48 % 75.49 % 55.88 % 1 s 1 core @ 2.5 Ghz (C/C++)
26 WRInception 62.85 % 78.19 % 55.64 % 0.06 s GPU @ 2.5 Ghz (C/C++)
27 Faster R-CNN code 62.81 % 71.41 % 55.44 % 2 s GPU @ 3.5 Ghz (Python + C/C++)
S. Ren, K. He, R. Girshick and J. Sun: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.
28 IVA code 60.99 % 67.88 % 54.34 % 1 s GPU @ 2.5 Ghz (Matlab + C/C++)
29 SDP+CRC (ft) 60.87 % 74.31 % 53.95 % 0.6 s GPU @ 2.5 Ghz (C/C++)
F. Yang, W. Choi and Y. Lin: Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition 2016.
30 PNET 58.70 % 73.97 % 51.63 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
31 Regionlets 58.69 % 70.09 % 51.81 % 1 s >8 cores @ 2.5 Ghz (C/C++)
X. Wang, M. Yang, S. Zhu and Y. Lin: Regionlets for Generic Object Detection. T-PAMI 2015.
W. Zou, X. Wang, M. Sun and Y. Lin: Generic Object Detection with Dense Neural Patterns and Regionlets. British Machine Vision Conference 2014.
C. Long, X. Wang, G. Hua, M. Yang and Y. Lin: Accurate Object Detection with Location Relaxation and Regionlets Relocalization. Asian Conference on Computer Vision 2014.
32 FRCNN+Or 55.80 % 68.81 % 50.52 % 0.1 s GPU @ 1.5 Ghz (Python + C/C++)
C. Guindel, D. Martin and J. Armingol: Joint Object Detection and Viewpoint Estimation using CNN features. IEEE International Conference on Vehicular Electronics and Safety (ICVES) 2017.
33 ANM 53.04 % 71.56 % 46.38 % 0.05 s GPU @ 2.5 Ghz (C/C++)
34 ANM 52.95 % 69.91 % 46.80 % 0.05 s GPU @ 2.5 Ghz (C/C++)
35 LPN 50.02 % 65.33 % 44.85 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
36 ZGC 48.06 % 64.87 % 40.74 % 0.12 s 1 core @ 2.5 Ghz (C/C++)
37 FD2 44.29 % 62.32 % 40.65 % 0.01 s GPU @ >3.5 Ghz (Python + C/C++)
38 maxFtr+ROI 43.59 % 49.65 % 38.74 % 0.25 s 4 cores @ 2.5 Ghz (C/C++)
W. Tian and M. Lauer: Detection and Orientation Estimation for Cyclists by Max Pooled Features. International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP) 2017.
39 HM 42.99 % 60.32 % 41.45 % 1 s 1 core @ 2.5 Ghz (C/C++)
40 FRO 42.98 % 59.96 % 38.97 % 0.19 s GPU @ 2.5 Ghz (Python)
41 GNN 42.65 % 59.43 % 37.72 % 0.2 s 1 core @ 2.5 Ghz (Python)
42 MV-RGBD-RF
This method makes use of Velodyne laser scans.
42.61 % 51.46 % 37.42 % 4 s 4 cores @ 2.5 Ghz (C/C++)
A. Gonzalez, D. Vazquez, A. Lopez and J. Amores: On-Board Object Detection: Multicue, Multimodal, and Multiview Random Forest of Local Experts. IEEE Trans. on Cybernetics 2016.
A. Gonzalez, G. Villalonga, J. Xu, D. Vazquez, J. Amores and A. Lopez: Multiview Random Forest of Local Experts Combining RGB and LIDAR data for Pedestrian Detection. IEEE Intelligent Vehicles Symposium (IV) 2015.
43 QHY 42.30 % 59.30 % 41.29 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
44 NMF-CNN 42.13 % 56.30 % 37.46 % 0.1 s GPU @ 2.5 Ghz (Matlab + C/C++)
45 AR-FCN 41.83 % 51.05 % 33.99 % 0.19 s GPU @ 2.5 Ghz (C/C++)
46 RCNN 40.38 % 50.77 % 33.07 % 0.08 s GPU @ 2.5 Ghz (Python + C/C++)
47 YOLOv2 code 39.96 % 56.59 % 33.06 % 0.03 s GPU @ 2.0 Ghz (Python + C/C++)
48 HL 39.10 % 55.19 % 32.66 % 0.16 s 1 core @ 2.5 Ghz (C/C++)
49 BNet 38.07 % 54.91 % 30.98 % 0.03 s 1 core @ 2.5 Ghz (C/C++)
50 pAUCEnsT 37.88 % 52.28 % 33.38 % 60 s 1 core @ 2.5 Ghz (Matlab + C/C++)
S. Paisitkriangkrai, C. Shen and A. Hengel: Pedestrian Detection with Spatially Pooled Features and Structured Ensemble Learning. arXiv 2014.
51 GNN 37.64 % 54.47 % 35.09 % 0.2 s 1 core @ 2.5 Ghz (Python)
52 FD 37.01 % 51.41 % 32.93 % 0.01 s GPU @ >3.5 Ghz (Python)
53 NMRDO 33.43 % 46.39 % 27.79 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
54 Vote3D
This method makes use of Velodyne laser scans.
31.24 % 41.45 % 28.60 % 0.5 s 4 cores @ 2.8 Ghz (C/C++)
D. Wang and I. Posner: Voting for Voting in Online Point Cloud Object Detection. Proceedings of Robotics: Science and Systems 2015.
55 DPM-VOC+VP 31.16 % 43.65 % 28.29 % 8 s 1 core @ 2.5 Ghz (C/C++)
B. Pepik, M. Stark, P. Gehler and B. Schiele: Multi-view and 3D Deformable Part Models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2015.
56 LSVM-MDPM-us code 30.81 % 40.31 % 28.17 % 10 s 4 cores @ 3.0 Ghz (C/C++)
P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.
57 ACF_M 29.24 % 37.71 % 27.52 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
58 LSVM-MDPM-sv 29.24 % 37.71 % 27.52 % 10 s 4 cores @ 3.0 Ghz (C/C++)
P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.
A. Geiger, C. Wojek and R. Urtasun: Joint 3D Estimation of Objects and Scene Layout. NIPS 2011.
59 DPM-C8B1
This method uses stereo information.
29.04 % 43.28 % 26.20 % 15 s 4 cores @ 2.5 Ghz (Matlab + C/C++)
J. Yebes, L. Bergasa and M. García-Garrido: Visual Object Recognition with 3D-Aware Features in KITTI Urban Scenes. Sensors 2015.
J. Yebes, L. Bergasa, R. Arroyo and A. Lázaro: Supervised learning and evaluation of KITTI's cars detector with DPM. IV 2014.
60 R-CNN_VGG 28.79 % 37.71 % 25.82 % 10 s GPU @ 2.5 Ghz (Matlab + C/C++)
61 mBoW
This method makes use of Velodyne laser scans.
21.62 % 28.19 % 20.93 % 10 s 1 core @ 2.5 Ghz (C/C++)
J. Behley, V. Steinhage and A. Cremers: Laser-based Segment Classification Using a Mixture of Bag-of-Words. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2013.
62 YOLO 13.96 % 18.07 % 13.83 % 0.03 s GPU @ 1.0 Ghz (C/C++)
63 YOLOv2 code 4.55 % 4.55 % 4.55 % 0.02 s GPU @ 3.5 Ghz (C/C++)
J. Redmon, S. Divvala, R. Girshick and A. Farhadi: You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016.
J. Redmon and A. Farhadi: YOLO9000: Better, Faster, Stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017.

Object Detection and Orientation Estimation Evaluation

Cars


Method Setting Code Moderate Easy Hard Runtime Environment
1 Deep MANTA 89.86 % 97.19 % 80.39 % 0.7 s GPU @ 2.5 Ghz (Python + C/C++)
F. Chabot, M. Chaouch, J. Rabarisoa, C. Teulière and T. Chateau: Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image. CVPR 2017.
2 RaC 89.25 % 89.98 % 80.07 % 1 s GPU @ 1.0 Ghz (C/C++)
3 uickitti 88.72 % 90.67 % 78.95 % 1.5 s GPU @ 2.5 Ghz (C/C++)
4 Deep3DBox 88.56 % 90.39 % 77.17 % 1.5 s GPU @ 2.5 Ghz (C/C++)
A. Mousavian, D. Anguelov, J. Flynn and J. Kosecka: 3D Bounding Box Estimation Using Deep Learning and Geometry. CVPR 2017.
5 SubCNN 88.43 % 90.61 % 78.63 % 2 s GPU @ 3.5 Ghz (Python + C/C++)
Y. Xiang, W. Choi, Y. Lin and S. Savarese: Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection. IEEE Winter Conference on Applications of Computer Vision (WACV) 2017.
6 DJML 88.27 % 89.90 % 78.29 % 2.4 s GPU @ 2.5 Ghz (Python + C/C++)
7 HM3D 87.29 % 89.41 % 77.08 % 0.35 s GPU @ >3.5 Ghz (C/C++)
8 DeepStereoOP 86.57 % 89.01 % 77.13 % 3.4 s GPU @ 3.5 Ghz (Matlab + C/C++)
C. Pham and J. Jeon: Robust Object Proposals Re-ranking for Object Detection in Autonomous Driving Using Convolutional Neural Networks. Signal Processing: Image Communication 2017.
9 Mono3D code 85.83 % 89.00 % 76.00 % 4.2 s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Z. Zhang, H. Ma, S. Fidler and R. Urtasun: Monocular 3D Object Detection for Autonomous Driving. CVPR 2016.
10 3DOP
This method uses stereo information.
code 85.81 % 88.56 % 76.21 % 3 s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Y. Zhu, A. Berneshawi, H. Ma, S. Fidler and R. Urtasun: 3D Object Proposals for Accurate Object Class Detection. NIPS 2015.
11 FRCNN+Or 77.80 % 88.93 % 67.87 % 0.1 s GPU @ 1.5 Ghz (Python + C/C++)
C. Guindel, D. Martin and J. Armingol: Joint Object Detection and Viewpoint Estimation using CNN features. IEEE International Conference on Vehicular Electronics and Safety (ICVES) 2017.
12 3D FCN
This method makes use of Velodyne laser scans.
75.71 % 85.46 % 68.19 % >5 s 1 core @ 2.5 Ghz (C/C++)
B. Li: 3D Fully Convolutional Network for Vehicle Detection in Point Cloud. IROS 2017.
13 Pose-RCNN 75.35 % 88.78 % 61.47 % 2 s >8 cores @ 2.5 Ghz (Python)
14 3DVP code 74.59 % 81.02 % 64.11 % 40 s 8 cores @ 3.5 Ghz (Matlab + C/C++)
Y. Xiang, W. Choi, Y. Lin and S. Savarese: Data-Driven 3D Voxel Patterns for Object Category Recognition. IEEE Conference on Computer Vision and Pattern Recognition 2015.
15 SubCat code 74.42 % 80.74 % 58.83 % 0.7 s 6 cores @ 3.5 Ghz (Matlab + C/C++)
E. Ohn-Bar and M. Trivedi: Learning to Detect Vehicles by Clustering Appearance Patterns. T-ITS 2015.
16 BdCost48LDCF 66.00 % 77.10 % 50.35 % 5 s 1 core @ 2.5 Ghz (C/C++)
17 BdCost48-25C 65.25 % 77.59 % 50.68 % 4 s 1 core @ 2.5 Ghz (C/C++)
18 OC-DPM 64.88 % 74.66 % 52.24 % 10 s 8 cores @ 2.5 Ghz (Matlab)
B. Pepik, M. Stark, P. Gehler and B. Schiele: Occlusion Patterns for Object Class Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2013.
19 DPM-VOC+VP 63.27 % 77.51 % 47.57 % 8 s 1 core @ 2.5 Ghz (C/C++)
B. Pepik, M. Stark, P. Gehler and B. Schiele: Multi-view and 3D Deformable Part Models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2015.
20 AOG-View 62.25 % 77.37 % 50.44 % 3 s 1 core @ 2.5 Ghz (Matlab, C/C++)
B. Li, T. Wu and S. Zhu: Integrating Context and Occlusion for Car Detection by Hierarchical And-Or Model. ECCV 2014.
21 NMRDO 59.55 % 77.38 % 51.91 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
22 LSVM-MDPM-sv 56.69 % 70.86 % 45.91 % 10 s 4 cores @ 3.0 Ghz (C/C++)
P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.
A. Geiger, C. Wojek and R. Urtasun: Joint 3D Estimation of Objects and Scene Layout. NIPS 2011.
23 GVPL 54.32 % 61.33 % 46.24 % 1 s 8 cores @ 2.5 Ghz (Matlab + C/C++)
24 VeloFCN
This method makes use of Velodyne laser scans.
52.70 % 70.21 % 46.11 % 1 s GPU @ 2.5 Ghz (Python + C/C++)
B. Li, T. Zhang and T. Xia: Vehicle Detection from 3D Lidar Using Fully Convolutional Network. RSS 2016.
25 DPM-C8B1
This method uses stereo information.
50.32 % 59.53 % 39.22 % 15 s 4 cores @ 2.5 Ghz (Matlab + C/C++)
J. Yebes, L. Bergasa and M. García-Garrido: Visual Object Recognition with 3D-Aware Features in KITTI Urban Scenes. Sensors 2015.
J. Yebes, L. Bergasa, R. Arroyo and A. Lázaro: Supervised learning and evaluation of KITTI's cars detector with DPM. IV 2014.
26 HSR2 45.46 % 47.03 % 40.60 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
27 Allspark 45.30 % 47.19 % 39.95 % 0.7 s GPU @ 2.5 Ghz (C/C++)
28 WRInception 45.07 % 47.05 % 40.52 % 0.06 s GPU @ 2.5 Ghz (C/C++)
29 sensekitti 44.56 % 47.06 % 41.50 % 4.5 s GPU @ 2.5 Ghz (Python + C/C++)
30 HM_SSD_RCNN 44.44 % 46.40 % 39.98 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
31 VCTNet 42.05 % 45.66 % 37.49 % 0.18 s GPU @ 3.5 GHz (C/C++)
32 DuEye 40.99 % 39.62 % 38.97 % 4 s GPU @ 2.5 Ghz (C/C++)
33 FD 40.40 % 46.30 % 34.01 % 0.01 s GPU @ >3.5 Ghz (Python)
34 FD2 39.44 % 47.56 % 35.20 % 0.01 s GPU @ >3.5 Ghz (Python + C/C++)
35 CPCD 38.93 % 36.51 % 34.15 % 3 s 1 core @ 2.5 Ghz (C/C++)
36 Re-3DOP 38.35 % 36.67 % 33.74 % 3 s 1 core @ 2.5 Ghz (C/C++)
37 UI 38.14 % 39.13 % 31.41 % 0.4 s GPU @ 2.5 Ghz (C/C++)
38 Direwolf 36.92 % 37.35 % 33.37 % 0.5 s GPU @ 2.5 Ghz (C/C++)
39 ZGC 36.69 % 45.54 % 32.23 % 0.12 s 1 core @ 2.5 Ghz (C/C++)
40 QHY 36.31 % 46.05 % 31.72 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
41 SYVO 36.17 % 36.85 % 29.14 % 0.13 s GPU @ 2.5 Ghz (C/C++)
42 HL 35.06 % 41.56 % 27.94 % 0.16 s 1 core @ 2.5 Ghz (C/C++)
43 ANM 34.79 % 35.30 % 31.75 % 0.05 s GPU @ 2.5 Ghz (C/C++)
44 ANM 32.72 % 34.26 % 28.06 % 0.05 s GPU @ 2.5 Ghz (C/C++)
45 LPN 32.41 % 33.97 % 29.15 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
46 SceneNet 32.02 % 36.62 % 28.46 % 0.03 s GPU @ 2.5 Ghz (C/C++)
47 AOG code 30.81 % 34.05 % 24.86 % 3 s 4 cores @ 2.5 Ghz (Matlab)
T. Wu, B. Li and S. Zhu: Learning And-Or Models to Represent Context and Occlusion for Car Detection and Viewpoint Estimation. TPAMI 2016.
B. Li, T. Wu and S. Zhu: Integrating Context and Occlusion for Car Detection by Hierarchical And-Or Model. ECCV 2014.
48 FCNN 28.85 % 35.35 % 25.25 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
49 YOLOv2 code 26.98 % 34.61 % 23.42 % 0.03 s GPU @ 2.0 Ghz (Python + C/C++)
50 NMF-CNN 26.11 % 32.01 % 19.11 % 0.1 s GPU @ 2.5 Ghz (Matlab + C/C++)
51 CSoR
This method makes use of Velodyne laser scans.
code 25.38 % 34.43 % 21.95 % 3.5 s 4 cores @ >3.5 Ghz (Python + C/C++)
L. Plotkin: PyDriver: Entwicklung eines Frameworks für räumliche Detektion und Klassifikation von Objekten in Fahrzeugumgebung. 2015.
52 SubCat48LDCF 24.27 % 28.56 % 19.02 % 5 s 1 core @ 2.5 Ghz (Matlab + C/C++)
53 frd 22.18 % 27.99 % 19.59 % 2 s 1 core @ 2.5 Ghz (C/C++)

Pedestrians


Method Setting Code Moderate Easy Hard Runtime Environment
1 uickitti 66.83 % 78.89 % 62.06 % 1.5 s GPU @ 2.5 Ghz (C/C++)
2 SubCNN 66.28 % 78.33 % 61.37 % 2 s GPU @ 3.5 Ghz (Python + C/C++)
Y. Xiang, W. Choi, Y. Lin and S. Savarese: Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection. IEEE Winter Conference on Applications of Computer Vision (WACV) 2017.
3 Pose-RCNN 59.89 % 74.10 % 54.21 % 2 s >8 cores @ 2.5 Ghz (Python)
4 3DOP
This method uses stereo information.
code 59.79 % 73.46 % 57.04 % 3 s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Y. Zhu, A. Berneshawi, H. Ma, S. Fidler and R. Urtasun: 3D Object Proposals for Accurate Object Class Detection. NIPS 2015.
5 DeepStereoOP 59.28 % 73.37 % 56.87 % 3.4 s GPU @ 3.5 Ghz (Matlab + C/C++)
C. Pham and J. Jeon: Robust Object Proposals Re-ranking for Object Detection in Autonomous Driving Using Convolutional Neural Networks. Signal Processing: Image Communication 2017.
6 HM3D 58.21 % 70.22 % 53.72 % 0.35 s GPU @ >3.5 Ghz (C/C++)
7 DJML 58.13 % 69.87 % 52.62 % 2.4 s GPU @ 2.5 Ghz (Python + C/C++)
8 Mono3D code 58.12 % 68.58 % 54.94 % 4.2 s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Z. Zhang, H. Ma, S. Fidler and R. Urtasun: Monocular 3D Object Detection for Autonomous Driving. CVPR 2016.
9 FRCNN+Or 52.96 % 67.92 % 49.61 % 0.1 s GPU @ 1.5 Ghz (Python + C/C++)
C. Guindel, D. Martin and J. Armingol: Joint Object Detection and Viewpoint Estimation using CNN features. IEEE International Conference on Vehicular Electronics and Safety (ICVES) 2017.
10 DPM-VOC+VP 39.83 % 53.66 % 35.73 % 8 s 1 core @ 2.5 Ghz (C/C++)
B. Pepik, M. Stark, P. Gehler and B. Schiele: Multi-view and 3D Deformable Part Models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2015.
11 Allspark 38.98 % 43.54 % 36.22 % 0.7 s GPU @ 2.5 Ghz (C/C++)
12 sensekitti 37.50 % 43.55 % 35.08 % 4.5 s GPU @ 2.5 Ghz (Python + C/C++)
13 VCTNet 36.57 % 40.77 % 33.63 % 0.18 s GPU @ 3.5 GHz (C/C++)
14 Re-3DOP 36.27 % 44.80 % 34.34 % 3 s 1 core @ 2.5 Ghz (C/C++)
15 ACF_M 35.49 % 47.00 % 32.42 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
16 LSVM-MDPM-sv 35.49 % 47.00 % 32.42 % 10 s 4 cores @ 3.0 Ghz (C/C++)
P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.
A. Geiger, C. Wojek and R. Urtasun: Joint 3D Estimation of Objects and Scene Layout. NIPS 2011.
17 WRInception 35.14 % 40.34 % 32.50 % 0.06 s GPU @ 2.5 Ghz (C/C++)
18 HM_SSD_RCNN 34.38 % 42.11 % 30.73 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
19 SubCat 34.18 % 43.95 % 30.76 % 1.2 s 6 cores @ 2.5 Ghz (Matlab + C/C++)
E. Ohn-Bar and M. Trivedi: Fast and Robust Object Detection Using Visual Subcategories. Computer Vision and Pattern Recognition Workshops Mobile Vision 2014.
20 HSR2 33.86 % 39.97 % 32.48 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
21 Tx 33.84 % 39.58 % 30.96 % 2 s GPU @ 2.5 Ghz (Matlab + C/C++)
22 RB 33.70 % 43.29 % 30.29 % 0.6 s GPU @ 2.5 Ghz (Matlab + C/C++)
23 NMRDO 33.06 % 44.95 % 31.83 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
24 SSD1 32.73 % 41.73 % 30.69 % 0.255 s GPU @ 2.5 Ghz (Python + C/C++)
25 RPN+BF code 32.55 % 40.97 % 29.52 % 0.6 s GPU @ 2.5 Ghz (Matlab + C/C++)
L. Zhang, L. Lin, X. Liang and K. He: Is Faster R-CNN Doing Well for Pedestrian Detection? ECCV 2016.
26 LPN 31.63 % 38.40 % 28.90 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
27 NMF-CNN 30.94 % 40.14 % 28.58 % 0.1 s GPU @ 2.5 Ghz (Matlab + C/C++)
28 ANM 30.04 % 39.60 % 27.56 % 0.05 s GPU @ 2.5 Ghz (C/C++)
29 FD2 28.59 % 35.53 % 26.02 % 0.01 s GPU @ >3.5 Ghz (Python + C/C++)
30 ANM 28.47 % 38.07 % 27.69 % 0.05 s GPU @ 2.5 Ghz (C/C++)
31 ACF 28.46 % 35.69 % 26.18 % 1 s 1 core @ 3.5 Ghz (Matlab + C/C++)
P. Dollár, R. Appel, S. Belongie and P. Perona: Fast Feature Pyramids for Object Detection. PAMI 2014.
32 FD 27.90 % 33.68 % 25.17 % 0.01 s GPU @ >3.5 Ghz (Python)
33 YOLOv2 code 27.35 % 32.98 % 22.99 % 0.03 s GPU @ 2.0 Ghz (Python + C/C++)
34 ZGC 26.42 % 34.65 % 22.57 % 0.12 s 1 core @ 2.5 Ghz (C/C++)
35 HL 24.21 % 32.32 % 20.43 % 0.16 s 1 core @ 2.5 Ghz (C/C++)
36 DPM-C8B1
This method uses stereo information.
23.37 % 31.08 % 20.72 % 15 s 4 cores @ 2.5 Ghz (Matlab + C/C++)
J. Yebes, L. Bergasa and M. García-Garrido: Visual Object Recognition with 3D-Aware Features in KITTI Urban Scenes. Sensors 2015.
J. Yebes, L. Bergasa, R. Arroyo and A. Lázaro: Supervised learning and evaluation of KITTI's cars detector with DPM. IV 2014.
37 ACF-MR 23.18 % 29.35 % 21.00 % 0.6 s 1 core @ 3.5 Ghz (C/C++)
R. Rajaram, E. Ohn-Bar and M. Trivedi: Looking at Pedestrians at Different Scales: A Multi-resolution Approach and Evaluations. T-ITS 2016.
38 QHY 21.79 % 30.60 % 21.41 % 0.1 s 1 core @ 2.5 Ghz (C/C++)

Cyclists


Method Setting Code Moderate Easy Hard Runtime Environment
1 uickitti 63.59 % 70.70 % 56.15 % 1.5 s GPU @ 2.5 Ghz (C/C++)
2 SubCNN 63.41 % 71.39 % 56.34 % 2 s GPU @ 3.5 Ghz (Python + C/C++)
Y. Xiang, W. Choi, Y. Lin and S. Savarese: Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection. IEEE Winter Conference on Applications of Computer Vision (WACV) 2017.
3 Pose-RCNN 62.25 % 74.85 % 55.09 % 2 s >8 cores @ 2.5 Ghz (Python)
4 DJML 59.76 % 69.26 % 52.94 % 2.4 s GPU @ 2.5 Ghz (Python + C/C++)
5 Deep3DBox 59.37 % 68.58 % 51.97 % 1.5 s GPU @ 2.5 Ghz (C/C++)
A. Mousavian, D. Anguelov, J. Flynn and J. Kosecka: 3D Bounding Box Estimation Using Deep Learning and Geometry. CVPR 2017.
6 3DOP
This method uses stereo information.
code 58.59 % 71.95 % 52.35 % 3 s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Y. Zhu, A. Berneshawi, H. Ma, S. Fidler and R. Urtasun: 3D Object Proposals for Accurate Object Class Detection. NIPS 2015.
7 DeepStereoOP 55.62 % 67.49 % 48.85 % 3.4 s GPU @ 3.5 Ghz (Matlab + C/C++)
C. Pham and J. Jeon: Robust Object Proposals Re-ranking for Object Detection in Autonomous Driving Using Convolutional Neural Networks. Signal Processing: Image Communication 2017.
8 HM3D 55.12 % 67.32 % 48.86 % 0.35 s GPU @ >3.5 Ghz (C/C++)
9 Mono3D code 53.11 % 65.74 % 48.87 % 4.2 s GPU @ 2.5 Ghz (Matlab + C/C++)
X. Chen, K. Kundu, Z. Zhang, H. Ma, S. Fidler and R. Urtasun: Monocular 3D Object Detection for Autonomous Driving. CVPR 2016.
10 FRCNN+Or 51.47 % 64.90 % 46.48 % 0.1 s GPU @ 1.5 Ghz (Python + C/C++)
C. Guindel, D. Martin and J. Armingol: Joint Object Detection and Viewpoint Estimation using CNN features. IEEE International Conference on Vehicular Electronics and Safety (ICVES) 2017.
11 VCTNet 43.79 % 48.73 % 38.05 % 0.18 s GPU @ 3.5 GHz (C/C++)
12 Allspark 42.27 % 48.01 % 36.27 % 0.7 s GPU @ 2.5 Ghz (C/C++)
13 sensekitti 42.12 % 46.65 % 36.66 % 4.5 s GPU @ 2.5 Ghz (Python + C/C++)
14 maxFtr+ROI 38.29 % 42.96 % 34.28 % 0.25 s 4 cores @ 2.5 Ghz (C/C++)
W. Tian and M. Lauer: Detection and Orientation Estimation for Cyclists by Max Pooled Features. International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP) 2017.
15 HSR2 36.82 % 42.76 % 32.33 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
16 HM_SSD_RCNN 35.64 % 44.70 % 29.65 % 0.15 s 1 core @ 2.5 Ghz (C/C++)
17 WRInception 34.02 % 41.88 % 29.37 % 0.06 s GPU @ 2.5 Ghz (C/C++)
18 Re-3DOP 29.69 % 31.49 % 27.42 % 3 s 1 core @ 2.5 Ghz (C/C++)
19 LPN 27.01 % 32.96 % 25.01 % 0.2 s GPU @ 2.5 Ghz (Python + C/C++)
20 ZGC 26.53 % 36.55 % 22.25 % 0.12 s 1 core @ 2.5 Ghz (C/C++)
21 FD2 24.65 % 35.58 % 21.97 % 0.01 s GPU @ >3.5 Ghz (Python + C/C++)
22 ANM 24.05 % 31.01 % 21.12 % 0.05 s GPU @ 2.5 Ghz (C/C++)
23 QHY 23.90 % 33.67 % 22.75 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
24 NMRDO 23.53 % 32.68 % 19.81 % 0.1 s GPU @ 2.5 Ghz (Python + C/C++)
25 DPM-VOC+VP 23.22 % 31.24 % 21.62 % 8 s 1 core @ 2.5 Ghz (C/C++)
B. Pepik, M. Stark, P. Gehler and B. Schiele: Multi-view and 3D Deformable Part Models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2015.
26 ACF_M 23.14 % 28.89 % 22.28 % 0.1 s 1 core @ 2.5 Ghz (C/C++)
27 LSVM-MDPM-sv 23.14 % 28.89 % 22.28 % 10 s 4 cores @ 3.0 Ghz (C/C++)
P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.
A. Geiger, C. Wojek and R. Urtasun: Joint 3D Estimation of Objects and Scene Layout. NIPS 2011.
28 ANM 22.82 % 30.83 % 20.15 % 0.05 s GPU @ 2.5 Ghz (C/C++)
29 YOLOv2 code 22.36 % 28.97 % 19.45 % 0.03 s GPU @ 2.0 Ghz (Python + C/C++)
30 FD 21.60 % 30.76 % 18.56 % 0.01 s GPU @ >3.5 Ghz (Python)
31 HL 21.41 % 30.22 % 17.64 % 0.16 s 1 core @ 2.5 Ghz (C/C++)
32 DPM-C8B1
This method uses stereo information.
19.25 % 27.16 % 17.95 % 15 s 4 cores @ 2.5 Ghz (Matlab + C/C++)
J. Yebes, L. Bergasa and M. García-Garrido: Visual Object Recognition with 3D-Aware Features in KITTI Urban Scenes. Sensors 2015.
J. Yebes, L. Bergasa, R. Arroyo and A. Lázaro: Supervised learning and evaluation of KITTI's cars detector with DPM. IV 2014.
33 NMF-CNN 16.78 % 22.03 % 15.10 % 0.1 s GPU @ 2.5 Ghz (Matlab + C/C++)

Related Datasets

Citation

If you use this dataset in your research, we would be happy if you cite us:
@INPROCEEDINGS{Geiger2012CVPR,
  author = {Andreas Geiger and Philip Lenz and Raquel Urtasun},
  title = {Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2012}
}


