\begin{tabular}{c | c | c | c | c | c | c}
{\bf Method} & {\bf Setting} & {\bf Moderate} & {\bf Easy} & {\bf Hard} & {\bf Runtime / Environment} & {\bf Reference}\\ \hline
Genome & & 90.43 \% & 90.85 \% & 81.97 \% & 4 s / GPU & \\
BM-NET & & 90.41 \% & 90.79 \% & 80.49 \% & 0.5 s / GPU & \\
SAIT & & 90.36 \% & 90.78 \% & 80.48 \% & 0.15 s / GPU & \\
TuSimple & & 90.33 \% & 90.77 \% & 82.86 \% & 1.6 s / GPU & \\
eagle & & 90.25 \% & 90.77 \% & 85.20 \% & 4 s / GPU & \\
RRC & & 90.22 \% & 90.61 \% & 87.44 \% & 3.6 s / GPU & J. Ren, X. Chen, J. Liu, W. Sun, J. Pang, Q. Yan, Y. Tai and L. Xu: Accurate Single Stage Detector Using Recurrent Rolling Convolution. CVPR 2017.\\
RV-CNN & & 90.14 \% & 90.76 \% & 84.96 \% & 3.5 s / GPU & \\
DuEye & & 90.11 \% & 90.65 \% & 85.46 \% & 4 s / GPU & \\
Direwolf & & 90.04 \% & 90.71 \% & 80.63 \% & 0.5 s / GPU & \\
Deep MANTA & & 90.03 \% & 97.25 \% & 80.62 \% & 0.7 s / GPU & F. Chabot, M. Chaouch, J. Rabarisoa, C. Teulière and T. Chateau: Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image. CVPR 2017.\\
sensekitti & & 90.00 \% & 90.76 \% & 81.83 \% & 4.5 s / GPU & \\
NVDriveNet-H & & 89.81 \% & 90.09 \% & 80.08 \% & 0.15 s / GPU & \\
Allspark & & 89.73 \% & 90.64 \% & 79.16 \% & 0.7 s / GPU & \\
SINet+ & & 89.63 \% & 90.55 \% & 77.64 \% & 0.3 s / GPU & \\
SDP+RPN & & 89.42 \% & 89.90 \% & 78.54 \% & 0.4 s / GPU & F. Yang, W. Choi and Y. Lin: Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition 2016. S. Ren, K. He, R. Girshick and J. Sun: Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems 2015.\\
DJML & & 89.40 \% & 90.51 \% & 79.58 \% & 2.4 s / GPU & \\
SINet\_VGG & & 89.32 \% & 90.60 \% & 77.85 \% & 0.2 s / GPU & \\
Pie & & 89.23 \% & 90.19 \% & 77.98 \% & 1.2 s / 1 core & \\
uickitti & & 89.17 \% & 90.80 \% & 79.58 \% & 1.5 s / GPU & \\
HSR2 & & 88.98 \% & 90.76 \% & 78.62 \% & 0.15 s / 1 core & \\
SubCNN & & 88.86 \% & 90.75 \% & 79.24 \% & 2 s / GPU & Y. Xiang, W. Choi, Y. Lin and S. Savarese: Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection. IEEE Winter Conference on Applications of Computer Vision (WACV) 2017.\\
Deep3DBox & & 88.86 \% & 90.47 \% & 77.60 \% & 1.5 s / GPU & A. Mousavian, D. Anguelov, J. Flynn and J. Kosecka: 3D Bounding Box Estimation Using Deep Learning and Geometry. CVPR 2017.\\
MS-CNN & & 88.83 \% & 90.46 \% & 74.76 \% & 0.4 s / GPU & Z. Cai, Q. Fan, R. Feris and N. Vasconcelos: A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection. ECCV 2016.\\
DeepStereoOP & & 88.75 \% & 90.34 \% & 79.39 \% & 3.4 s / GPU & C. Pham and J. Jeon: Robust Object Proposals Re-ranking for Object Detection in Autonomous Driving Using Convolutional Neural Networks. Signal Processing: Image Communication 2017.\\
SINet\_PVA & & 88.73 \% & 90.26 \% & 75.28 \% & 0.11 s / GPU & \\
CPCD & & 88.73 \% & 90.28 \% & 79.20 \% & 3 s / 1 core & \\
HM\_SSD\_RCNN & & 88.69 \% & 90.47 \% & 77.86 \% & 0.15 s / 1 core & \\
Re-3DOP & & 88.46 \% & 90.27 \% & 78.93 \% & 3 s / 1 core & \\
3DOP & st & 88.34 \% & 90.09 \% & 78.79 \% & 3 s / GPU & X. Chen, K. Kundu, Y. Zhu, A. Berneshawi, H. Ma, S. Fidler and R. Urtasun: 3D Object Proposals for Accurate Object Class Detection. NIPS 2015.\\
MM-MRFC & fl la & 88.20 \% & 90.93 \% & 78.02 \% & 0.05 s / GPU & \\
SYVO & & 88.14 \% & 89.60 \% & 71.05 \% & 0.13 s / GPU & \\
Mono3D & & 87.86 \% & 90.27 \% & 78.09 \% & 4.2 s / GPU & X. Chen, K. Kundu, Z. Zhang, H. Ma, S. Fidler and R. Urtasun: Monocular 3D Object Detection for Autonomous Driving. CVPR 2016.\\
WRInception & & 87.62 \% & 88.98 \% & 77.52 \% & 0.06 s / GPU & \\
CNN based & & 87.42 \% & 88.76 \% & 77.55 \% & 1 s / 1 core & \\
UI & & 87.34 \% & 89.56 \% & 71.16 \% & 0.4 s / GPU & \\
tbd & & 86.97 \% & 90.10 \% & 77.94 \% & 1 s / 1 core & \\
TWSNet & & 86.30 \% & 90.03 \% & 71.36 \% & 0.48 s / GPU & \\
ANM & & 85.61 \% & 86.80 \% & 77.20 \% & 0.05 s / GPU & \\
SDP+CRC (ft) & & 81.33 \% & 90.39 \% & 70.33 \% & 0.6 s / GPU & F. Yang, W. Choi and Y. Lin: Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition 2016.\\
ANM & & 81.29 \% & 85.23 \% & 69.32 \% & 0.05 s / GPU & \\
MV3D & la & 80.56 \% & 89.67 \% & 79.43 \% & 0.36 s / GPU & X. Chen, H. Ma, J. Wan, B. Li and T. Xia: Multi-View 3D Object Detection Network for Autonomous Driving. CVPR 2017.\\
RefineNet & & 79.21 \% & 90.16 \% & 65.71 \% & 0.20 s / GPU & R. Rajaram, E. Bar and M. Trivedi: RefineNet: Iterative Refinement for Accurate Object Localization. Intelligent Transportation Systems Conference 2016.\\
MV3D (LIDAR) & la & 79.17 \% & 89.01 \% & 78.09 \% & 0.24 s / GPU & X. Chen, H. Ma, J. Wan, B. Li and T. Xia: Multi-View 3D Object Detection Network for Autonomous Driving. CVPR 2017.\\
Faster R-CNN & & 79.11 \% & 87.90 \% & 70.19 \% & 2 s / GPU & S. Ren, K. He, R. Girshick and J. Sun: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.\\
PNET & & 78.98 \% & 85.21 \% & 71.20 \% & 0.1 s / GPU & \\
spLBP & & 77.39 \% & 80.16 \% & 60.59 \% & 1.5 s / 8 cores & Q. Hu, S. Paisitkriangkrai, C. Shen, A. Hengel and F. Porikli: Fast Detection of Multiple Objects in Traffic Scenes With a Common Detection Framework. IEEE Trans. Intelligent Transportation Systems 2016.\\
SceneNet & & 77.34 \% & 87.90 \% & 68.38 \% & 0.03 s / GPU & \\
Reinspect & & 76.65 \% & 88.36 \% & 66.56 \% & 2 s / 1 core & R. Stewart, M. Andriluka and A. Ng: End-to-End People Detection in Crowded Scenes. CVPR 2016.\\
Regionlets & & 76.56 \% & 86.50 \% & 59.82 \% & 1 s / $>$8 cores & X. Wang, M. Yang, S. Zhu and Y. Lin: Regionlets for Generic Object Detection. T-PAMI 2015. W. Zou, X. Wang, M. Sun and Y. Lin: Generic Object Detection with Dense Neural Patterns and Regionlets. British Machine Vision Conference 2014. C. Long, X. Wang, G. Hua, M. Yang and Y. Lin: Accurate Object Detection with Location Relaxation and Regionlets Relocalization. Asian Conference on Computer Vision 2014.\\
AOG & & 75.97 \% & 85.58 \% & 60.96 \% & 3 s / 4 cores & T. Wu, B. Li and S. Zhu: Learning And-Or Models to Represent Context and Occlusion for Car Detection and Viewpoint Estimation. TPAMI 2016. B. Li, T. Wu and S. Zhu: Integrating Context and Occlusion for Car Detection by Hierarchical And-Or Model. ECCV 2014.\\
XXX & la & 75.83 \% & 85.54 \% & 68.30 \% & $>$5 s / 1 core & \\
3DVP & & 75.77 \% & 81.46 \% & 65.38 \% & 40 s / 8 cores & Y. Xiang, W. Choi, Y. Lin and S. Savarese: Data-Driven 3D Voxel Patterns for Object Category Recognition. IEEE Conference on Computer Vision and Pattern Recognition 2015.\\
Pose-RCNN & & 75.74 \% & 88.89 \% & 61.86 \% & 2 s / $>$8 cores & \\
AR-FCN & & 75.49 \% & 81.24 \% & 66.00 \% & 0.19 s / GPU & \\
SubCat & & 75.46 \% & 81.45 \% & 59.71 \% & 0.7 s / 6 cores & E. Ohn-Bar and M. Trivedi: Learning to Detect Vehicles by Clustering Appearance Patterns. T-ITS 2015.\\
FD2 & & 74.68 \% & 87.14 \% & 65.70 \% & 0.01 s / GPU & \\
FD & & 72.64 \% & 82.34 \% & 60.31 \% & 0.01 s / GPU & \\
SmartCNN & & 71.10 \% & 77.00 \% & 56.97 \% & 1 s / 1 core & \\
FCNN & & 70.67 \% & 88.04 \% & 61.50 \% & 0.1 s / 1 core & \\
MV-RGBD-RF & la & 69.92 \% & 76.49 \% & 57.47 \% & 4 s / 4 cores & A. Gonzalez, D. Vazquez, A. Lopez and J. Amores: On-Board Object Detection: Multicue, Multimodal, and Multiview Random Forest of Local Experts. IEEE Trans. on Cybernetics 2016. A. Gonzalez, G. Villalonga, J. Xu, D. Vazquez, J. Amores and A. Lopez: Multiview Random Forest of Local Experts Combining RGB and LIDAR data for Pedestrian Detection. IEEE Intelligent Vehicles Symposium (IV) 2015.\\
AOG-View & & 69.89 \% & 84.29 \% & 57.25 \% & 3 s / 1 core & B. Li, T. Wu and S. Zhu: Integrating Context and Occlusion for Car Detection by Hierarchical And-Or Model. ECCV 2014.\\
Vote3Deep & la & 68.39 \% & 76.95 \% & 63.22 \% & 1.5 s / 4 cores & M. Engelcke, D. Rao, D. Zeng Wang, C. Hay Tong and I. Posner: Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks. ArXiv e-prints 2016.\\
ZGC & & 68.02 \% & 85.19 \% & 58.88 \% & 0.12 s / 1 core & \\
GPVL & & 67.89 \% & 77.76 \% & 58.23 \% & 10 s / 1 core & \\
GVPL & & 67.78 \% & 77.74 \% & 57.69 \% & 1 s / 8 cores & \\
QHY & & 67.69 \% & 85.23 \% & 58.64 \% & 0.1 s / 1 core & \\
BdCost48LDCF & & 67.07 \% & 77.93 \% & 51.15 \% & 5 s / 1 core & \\
OC-DPM & & 66.45 \% & 76.16 \% & 53.70 \% & 10 s / 8 cores & B. Pepik, M. Stark, P. Gehler and B. Schiele: Occlusion Patterns for Object Class Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2013.\\
DPM-VOC+VP & & 66.25 \% & 80.45 \% & 49.86 \% & 8 s / 1 core & B. Pepik, M. Stark, P. Gehler and B. Schiele: Multi-view and 3D Deformable Part Models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2015.\\
BdCost48-25C & & 65.95 \% & 78.21 \% & 51.23 \% & 4 s / 1 core & \\
RCNN & & 65.94 \% & 84.47 \% & 51.00 \% & 0.08 s / GPU & \\
HL & & 64.94 \% & 77.55 \% & 50.53 \% & 0.16 s / 1 core & \\
LCNN & & 64.74 \% & 79.18 \% & 56.79 \% & 1 s / 1 core & \\
MDPM-un-BB & & 64.20 \% & 77.32 \% & 50.18 \% & 60 s / 4 cores & P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.\\
NMF-CNN & & 62.21 \% & 77.32 \% & 49.29 \% & 0.1 s / GPU & \\
NMRDO & & 61.72 \% & 79.48 \% & 54.06 \% & 0.1 s / GPU & \\
SubCat48LDCF & & 61.57 \% & 74.14 \% & 48.18 \% & 5 s / 1 core & \\
DPM-C8B1 & st & 60.99 \% & 74.95 \% & 47.16 \% & 15 s / 4 cores & J. Yebes, L. Bergasa and M. García-Garrido: Visual Object Recognition with 3D-Aware Features in KITTI Urban Scenes. Sensors 2015. J. Yebes, L. Bergasa, R. Arroyo and A. Lázaro: Supervised learning and evaluation of KITTI's cars detector with DPM. IV 2014.\\
HgCNN & & 60.72 \% & 73.77 \% & 52.95 \% & 1 s / 1 core & \\
LSVM-MDPM-sv & & 57.44 \% & 71.70 \% & 46.58 \% & 10 s / 4 cores & P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010. A. Geiger, C. Wojek and R. Urtasun: Joint 3D Estimation of Objects and Scene Layout. NIPS 2011.\\
Faster RCNN & & 56.58 \% & 62.31 \% & 45.27 \% & 0.11 s / GPU & \\
LSVM-MDPM-us & & 56.10 \% & 70.52 \% & 42.87 \% & 10 s / 4 cores & P. Felzenszwalb, R. Girshick, D. McAllester and D. Ramanan: Object Detection with Discriminatively Trained Part-Based Models. PAMI 2010.\\
ACF-SC & & 55.76 \% & 69.76 \% & 46.27 \% & & C. Cadena, A. Dick and I. Reid: A Fast, Modular Scene Understanding System using Context-Aware Object Detection. IEEE International Conference on Robotics and Automation (ICRA) 2015.\\
frd & & 55.70 \% & 69.86 \% & 48.53 \% & 2 s / 1 core & \\
MLSmoke & & 53.54 \% & 77.41 \% & 44.63 \% & 1 s / 1 core & \\
VeloFCN & la & 53.45 \% & 70.68 \% & 46.90 \% & 1 s / GPU & B. Li, T. Zhang and T. Xia: Vehicle Detection from 3D Lidar Using Fully Convolutional Network. RSS 2016.\\
ACF & & 52.81 \% & 62.82 \% & 43.89 \% & 0.2 s / 1 core & P. Doll\'ar, R. Appel, S. Belongie and P. Perona: Fast Feature Pyramids for Object Detection. PAMI 2014. P. Doll\'ar: Piotr's Image and Video Matlab Toolbox (PMT).\\
Vote3D & la & 48.05 \% & 56.66 \% & 42.64 \% & 0.5 s / 4 cores & D. Wang and I. Posner: Voting for Voting in Online Point Cloud Object Detection. Proceedings of Robotics: Science and Systems 2015.\\
YOLO & & 35.86 \% & 49.47 \% & 29.74 \% & 0.03 s / GPU & \\
CSoR & la & 26.13 \% & 35.24 \% & 22.69 \% & 3.5 s / 4 cores & L. Plotkin: PyDriver: Entwicklung eines Frameworks für räumliche Detektion und Klassifikation von Objekten in Fahrzeugumgebung. 2015.\\
R-CNN\_VGG & & 26.04 \% & 32.23 \% & 20.93 \% & 10 s / GPU & \\
mBoW & la & 23.76 \% & 37.63 \% & 18.44 \% & 10 s / 1 core & J. Behley, V. Steinhage and A. Cremers: Laser-based Segment Classification Using a Mixture of Bag-of-Words. Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2013.\\
YOLOv2 & & 19.31 \% & 28.37 \% & 15.94 \% & 0.02 s / GPU & J. Redmon, S. Divvala, R. Girshick and A. Farhadi: You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016. J. Redmon and A. Farhadi: YOLO9000: Better, Faster, Stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017.
\end{tabular}