Object Tracking Evaluation 2012


The object tracking benchmark consists of 21 training sequences and 29 test sequences. Although we have labeled 8 different classes, only the classes 'Car' and 'Pedestrian' are evaluated in our benchmark, as only those classes have enough labeled instances for a comprehensive evaluation. The labeling process was performed in two steps: First, we hired a set of annotators to label 3D bounding boxes as tracklets in point clouds. Since a single 3D bounding box tracklet with fixed dimensions often fits a pedestrian badly, we additionally labeled the left/right boundaries of each object using Mechanical Turk. We also collected labels of each object's occlusion state and computed its truncation by backprojecting a car/pedestrian model into the image plane. We evaluate submitted results using the common metrics CLEAR MOT and MT/PT/ML. Since there is no single ranking criterion, we do not rank methods. Our development kit provides details about the data format as well as utility functions for reading and writing the label files.

The goal of the object tracking task is to estimate object tracklets for the classes 'Car' and 'Pedestrian'. We evaluate 2D 0-based bounding boxes in each image. We encourage participants to add a confidence measure for every frame of each track. For evaluation, we only consider detections/objects larger than 25 pixels (height) in the image, and we do not count Vans as false positives for Cars or Sitting Persons as false positives for Pedestrians due to their similarity in appearance. As evaluation criteria we follow the CLEAR MOT [1] and Mostly-Tracked/Partly-Tracked/Mostly-Lost [2] metrics. We do not rank methods by a single criterion, but bold numbers indicate the best method for a particular metric. To make the methods comparable, the time for object detection is not included in the specified runtime.
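
The filtering rules described above can be illustrated with a minimal Python sketch. This is not the devkit's evaluation code: the Box class, the label strings 'Van' and 'Person_sitting', the helper names, and the 0.5 overlap threshold are all assumptions made for illustration. The sketch skips boxes that are not taller than 25 pixels and ignores unmatched detections that overlap a ground-truth box of a similar-looking class instead of counting them as false positives.

```python
from __future__ import annotations

from dataclasses import dataclass

MIN_HEIGHT_PX = 25  # height threshold stated in the text

# Similar-looking classes that are not counted as false positives
# (label strings are assumed, not taken from the devkit).
NEIGHBOR_CLASSES = {"Car": {"Van"}, "Pedestrian": {"Person_sitting"}}


@dataclass
class Box:
    """Hypothetical 2D box in 0-based pixel coordinates."""
    label: str
    left: float
    top: float
    right: float
    bottom: float

    @property
    def height(self) -> float:
        return self.bottom - self.top

    @property
    def area(self) -> float:
        return (self.right - self.left) * self.height


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a.right, b.right) - max(a.left, b.left)
    ih = min(a.bottom, b.bottom) - max(a.top, b.top)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (a.area + b.area - inter)


def is_evaluated(box: Box) -> bool:
    """Only boxes larger than the minimum height are considered."""
    return box.height > MIN_HEIGHT_PX


def count_false_positives(unmatched: list[Box], ground_truth: list[Box],
                          target: str, min_iou: float = 0.5) -> int:
    """Count unmatched detections of class `target` as false positives,
    skipping small boxes and boxes that cover a neighboring class.
    The 0.5 overlap threshold is an assumption."""
    neighbors = [g for g in ground_truth if g.label in NEIGHBOR_CLASSES[target]]
    count = 0
    for det in unmatched:
        if not is_evaluated(det):
            continue  # too small to be evaluated
        if any(iou(det, n) >= min_iou for n in neighbors):
            continue  # e.g. a 'Car' detection sitting on a van
        count += 1
    return count
```

For example, an unmatched 'Car' detection that closely overlaps a labeled van would be skipped by count_false_positives, while the same detection over empty road would be counted.
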
Note: On 01.06.2015 we fixed several bugs in the evaluation script and in the calculation of the CLEAR MOT metrics. We also fixed some problems in the annotations of the training and test sets (almost completely occluded objects are no longer counted as false negatives). Furthermore, from now on vans are not counted as false positives for cars, and sitting persons are not counted as false positives for pedestrians. We have also improved the devkit with new illustrations and re-calculated the results for all methods. Please download the devkit and the annotations/labels with the improved ground truth for training again if you downloaded the files prior to 20.05.2015. Please consider reporting these new numbers for all future submissions. The last leaderboards right before the changes can be found here!
[1] K. Bernardin, R. Stiefelhagen: Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. JIVP 2008.
[2] Y. Li, C. Huang, R. Nevatia: Learning to associate: HybridBoosted multi-target tracker for crowded scene. CVPR 2009.
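
As a rough illustration of the two metric families cited above, the following Python sketch computes MOTA as 1 - (FN + FP + IDS) / GT, following [1], and assigns each ground-truth trajectory to Mostly-Tracked, Partly-Tracked, or Mostly-Lost using the 80 % and 20 % coverage thresholds from [2]. The function names and input layout are hypothetical, the handling of values exactly on a threshold is an assumption, and the evaluation script in the devkit may differ in such details.

```python
from collections import Counter


def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multiple Object Tracking Accuracy.

    num_gt is the total number of ground-truth boxes over all frames.
    The value can be negative when the errors outnumber the ground truth.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def classify_trajectory(tracked_frames: int, total_frames: int,
                        mt_ratio: float = 0.8, ml_ratio: float = 0.2) -> str:
    """Label one ground-truth trajectory by the fraction of its frames
    in which it is tracked. Boundary handling is an assumption."""
    ratio = tracked_frames / total_frames
    if ratio > mt_ratio:
        return "MT"
    if ratio < ml_ratio:
        return "ML"
    return "PT"


def mt_pt_ml(coverage: list) -> dict:
    """Fractions of mostly-tracked, partly-tracked and mostly-lost
    trajectories, given (tracked_frames, total_frames) pairs."""
    counts = Counter(classify_trajectory(t, n) for t, n in coverage)
    return {key: counts[key] / len(coverage) for key in ("MT", "PT", "ML")}
```

With these definitions, a trajectory tracked in 9 of its 10 frames is labeled MT, one tracked in 5 of 10 frames is PT, and one tracked in 1 of 10 frames is ML.
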

Additional information used by the methods
  • Stereo: Method uses left and right (stereo) images
  • Laser Points: Method uses point clouds from Velodyne laser scanner
  • GPS: Method uses GPS information
  • Online: Online method (frame-by-frame processing, no latency)
  • Additional training data: Use of additional data sources for training (see details)

CAR


Method | Setting | Code | MOTA | MOTP | MT | ML | IDS | FRAG | Runtime | Environment
DP_MCF | - | code | 36.62 % | 78.49 % | 11.13 % | 39.18 % | 2738 | 3240 | 0.01 s | 1 core @ 2.5 GHz (Matlab)
  H. Pirsiavash, D. Ramanan and C. Fowlkes: Globally-Optimal Greedy Algorithms for Tracking a Variable Number of Objects. IEEE conference on Computer Vision and Pattern Recognition (CVPR) 2011.
HM | online | - | 42.22 % | 78.42 % | 7.77 % | 41.92 % | 12 | 577 | 0.01 s | 1 core @ 2.5 GHz (Python)
MCF | - | - | 44.28 % | 78.32 % | 10.98 % | 39.94 % | 23 | 590 | 0.01 s | 1 core @ 2.5 GHz (Python + C/C++)
  L. Zhang, Y. Li and R. Nevatia: Global data association for multi-object tracking using network flows. CVPR.
TBD | - | code | 52.44 % | 78.47 % | 13.87 % | 34.30 % | 33 | 538 | 10 s | 1 core @ 2.5 GHz (Matlab + C/C++)
  A. Geiger, M. Lauer, C. Wojek, C. Stiller and R. Urtasun: 3D Traffic Scene Understanding from Movable Platforms. Pattern Analysis and Machine Intelligence (PAMI) 2014.
  H. Zhang, A. Geiger and R. Urtasun: Understanding High-Level Semantics by Modeling Traffic Patterns. International Conference on Computer Vision (ICCV) 2013.
SSP | - | - | 54.65 % | 77.78 % | 21.34 % | 27.13 % | 7 | 715 | 0.6 s | 1 core @ 2.7 GHz (Python)
  Anonymous submission
mbodSSP | online | - | 52.43 % | 77.67 % | 15.09 % | 29.73 % | 0 | 705 | 0.01 s | 1 core @ 2.7 GHz (Python)
  Anonymous submission
DCO | - | code | 35.17 % | 74.50 % | 10.67 % | 33.69 % | 223 | 622 | 0.03 s | 1 core @ >3.5 GHz (Matlab + C/C++)
  A. Andriyenko, K. Schindler and S. Roth: Discrete-Continuous Optimization for Multi-Target Tracking. CVPR 2012.
CEM | - | code | 48.23 % | 77.26 % | 14.48 % | 33.99 % | 125 | 398 | 0.09 s | 1 core @ >3.5 GHz (Matlab + C/C++)
  A. Milan, S. Roth and K. Schindler: Continuous Energy Minimization for Multitarget Tracking. IEEE TPAMI 2014.
NOMT | - | - | 63.27 % | 78.32 % | 31.55 % | 27.59 % | 13 | 155 | 0.09 s | 16 cores @ 2.5 GHz (C++)
  W. Choi: Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor. ICCV 2015.
NOMT-HM | online | - | 58.30 % | 78.79 % | 26.98 % | 30.18 % | 28 | 251 | 0.09 s | 8 cores @ 2.5 GHz (Matlab + C/C++)
  W. Choi: Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor. ICCV 2015.
SSP* | - | - | 67.09 % | 78.64 % | 40.70 % | 8.99 % | 194 | 966 | 0.6 s | 1 core @ 2.7 GHz (Python)
  Anonymous submission
mbodSSP* | online | - | 67.31 % | 78.83 % | 34.45 % | 10.37 % | 117 | 884 | 0.01 s | 1 core @ 2.7 GHz (Python)
  Anonymous submission
NOMT-HM* | online | - | 69.86 % | 80.10 % | 38.72 % | 15.09 % | 109 | 372 | 0.09 s | 8 cores @ 2.5 GHz (Matlab + C/C++)
  W. Choi: Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor. ICCV 2015.
SCEA* | online | - | 70.24 % | 79.52 % | 38.72 % | 12.65 % | 106 | 467 | 0.06 s | 1 core @ 4.0 GHz (Matlab + C/C++)
  Anonymous submission
SCEA | online | - | 54.42 % | 78.98 % | 19.97 % | 29.27 % | 17 | 469 | 0.05 s | 1 core @ 4.0 GHz (Matlab + C/C++)
  Anonymous submission
NOMT* | - | - | 72.62 % | 79.55 % | 43.14 % | 14.48 % | 38 | 227 | 0.09 s | 16 cores @ 2.5 GHz (C++)
  W. Choi: Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor. ICCV 2015.
RMOT | online | - | 49.87 % | 75.33 % | 15.24 % | 33.54 % | 51 | 385 | 0.01 s | 1 core @ 3.5 GHz (Matlab)
  J. Yoon, M. Yang, J. Lim and K. Yoon: Bayesian Multi-Object Tracking Using Motion Context from Multiple Objects. IEEE Winter Conference on Applications of Computer Vision (WACV) 2015.
DCO_X* | - | code | 62.76 % | 78.96 % | 26.22 % | 15.40 % | 326 | 984 | 0.9 s | 1 core @ >3.5 GHz (Matlab + C/C++)
  A. Milan, K. Schindler and S. Roth: Detection- and Trajectory-Level Exclusion in Multiple Object Tracking. CVPR 2013.
RMOT* | online | - | 60.46 % | 75.57 % | 26.98 % | 11.13 % | 216 | 742 | 0.02 s | 1 core @ 3.5 GHz (Matlab)
  J. Yoon, M. Yang, J. Lim and K. Yoon: Bayesian Multi-Object Tracking Using Motion Context from Multiple Objects. IEEE Winter Conference on Applications of Computer Vision (WACV) 2015.
ODAMOT | online | - | 57.06 % | 75.45 % | 16.77 % | 18.75 % | 404 | 1304 | 1 s | 1 core @ 2.5 GHz (Python)
  A. Gaidon and E. Vig: Online Domain Adaptation for Multi-Object Tracking. British Machine Vision Conference (BMVC) 2015.
LP_SSVM | - | - | 58.12 % | 77.24 % | 28.05 % | 23.17 % | 18 | 442 | 0.06 s | 1 core @ 2.5 GHz (Matlab + C/C++)
  S. Wang and C. Fowlkes: Learning Optimal Parameters For Multi-target Tracking. British Machine Vision Conference (BMVC) 2015.
SSP_SSVM* | - | - | 71.19 % | 77.70 % | 41.77 % | 9.60 % | 73 | 568 | 0.01 s | 1 core @ 2.5 GHz (C/C++)
  S. Wang and C. Fowlkes: Learning Optimal Parameters For Multi-target Tracking. British Machine Vision Conference (BMVC) 2015.
FMMOVT | - | - | 38.37 % | 76.84 % | 28.35 % | 19.66 % | 1442 | 1988 | 0.05 s | 1 core @ 2.5 GHz (Python)
  Anonymous submission
FMMOVT | - | - | 30.41 % | 77.78 % | 11.74 % | 36.28 % | 511 | 936 | 0.05 s | 1 core @ 2.5 GHz (C/C++)
  Anonymous submission
ANM | online | - | 68.12 % | 79.02 % | 33.54 % | 12.96 % | 115 | 798 | 0.01 s | 1 core @ 2.5 GHz (Matlab)
  Anonymous submission
DP_MCF_RCNN | - | - | 52.83 % | 81.34 % | 15.40 % | 26.22 % | 154 | 845 | 0.01 s | 1 core @ 2.5 GHz (Python)
  Anonymous submission
RCMOT_COR* | online | - | 70.24 % | 77.87 % | 36.59 % | 15.85 % | 125 | 403 | 0.83 s | 1 core @ 2.5 GHz (Matlab)
  Anonymous submission

PEDESTRIAN


Method | Setting | Code | MOTA | MOTP | MT | ML | IDS | FRAG | Runtime | Environment
CEM | - | code | 18.18 % | 68.48 % | 7.90 % | 52.92 % | 96 | 610 | 0.09 s | 1 core @ >3.5 GHz (Matlab + C/C++)
  A. Milan, S. Roth and K. Schindler: Continuous Energy Minimization for Multitarget Tracking. IEEE TPAMI 2014.
NOMT | - | - | 25.55 % | 67.75 % | 14.43 % | 42.61 % | 34 | 800 | 0.09 s | 16 cores @ 2.5 GHz (C++)
  W. Choi: Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor. ICCV 2015.
NOMT-HM | online | - | 17.26 % | 67.99 % | 11.34 % | 51.55 % | 73 | 743 | 0.09 s | 8 cores @ 2.5 GHz (Matlab + C/C++)
  W. Choi: Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor. ICCV 2015.
NOMT-HM* | online | - | 31.43 % | 71.14 % | 17.18 % | 42.27 % | 186 | 870 | 0.09 s | 8 cores @ 2.5 GHz (Matlab + C/C++)
  W. Choi: Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor. ICCV 2015.
SCEA* | online | - | 39.34 % | 71.86 % | 14.09 % | 43.30 % | 56 | 649 | 0.06 s | 1 core @ 4.0 GHz (Matlab + C/C++)
  Anonymous submission
SCEA | online | - | 26.02 % | 68.45 % | 8.59 % | 47.42 % | 16 | 724 | 0.05 s | 1 core @ 4.0 GHz (Matlab + C/C++)
  Anonymous submission
NOMT* | - | - | 38.98 % | 71.45 % | 23.37 % | 34.71 % | 63 | 672 | 0.09 s | 16 cores @ 2.5 GHz (C++)
  W. Choi: Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor. ICCV 2015.
RMOT | online | - | 25.47 % | 68.06 % | 9.97 % | 47.42 % | 81 | 692 | 0.01 s | 1 core @ 3.5 GHz (Matlab)
  J. Yoon, M. Yang, J. Lim and K. Yoon: Bayesian Multi-Object Tracking Using Motion Context from Multiple Objects. IEEE Winter Conference on Applications of Computer Vision (WACV) 2015.
RMOT* | online | - | 36.42 % | 71.02 % | 16.84 % | 41.24 % | 156 | 760 | 0.02 s | 1 core @ 3.5 GHz (Matlab)
  J. Yoon, M. Yang, J. Lim and K. Yoon: Bayesian Multi-Object Tracking Using Motion Context from Multiple Objects. IEEE Winter Conference on Applications of Computer Vision (WACV) 2015.
LP_SSVM | - | - | 22.62 % | 67.37 % | 10.31 % | 43.30 % | 69 | 842 | 0.06 s | 1 core @ 2.5 GHz (Matlab + C/C++)
  S. Wang and C. Fowlkes: Learning Optimal Parameters For Multi-target Tracking. British Machine Vision Conference (BMVC) 2015.
SSP_SSVM* | - | - | 35.03 % | 70.23 % | 18.56 % | 33.68 % | 108 | 869 | 0.01 s | 1 core @ 2.5 GHz (C/C++)
  S. Wang and C. Fowlkes: Learning Optimal Parameters For Multi-target Tracking. British Machine Vision Conference (BMVC) 2015.
RCMOT_COR* | online | - | 35.32 % | 70.55 % | 17.18 % | 45.70 % | 147 | 830 | 0.83 s | 1 core @ 2.5 GHz (Matlab)
  Anonymous submission

Related Datasets

  • TUD Datasets: "TUD Multiview Pedestrians" and "TUD Stadtmitte" Datasets.
  • PETS 2009: The Datasets for the "Performance Evaluation of Tracking and Surveillance" Workshop.
  • EPFL Terrace: Multi-camera pedestrian videos.
  • ETHZ Sequences: Inner City Sequences from Mobile Platforms.

Citation

When using this dataset in your research, we will be happy if you cite us:
@INPROCEEDINGS{Geiger2012CVPR,
  author = {Andreas Geiger and Philip Lenz and Raquel Urtasun},
  title = {Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2012}
}


