The KITTI Vision Benchmark Suite

Method

MOTSFusion (Pedestrians) [MOTSFusion]
https://github.com/tobiasfshr/MOTSFusion

Submitted on 5 Dec. 2019 22:02 by
Jonathon Luiten (RWTH Aachen University)

Running time:		0.44 s
Environment:		1 core @ 2.5 GHz (C/C++)

Method Description:

First, we build tracklets by computing a
segmentation mask for each detection and linking
these masks over time using optical flow. We then
fuse these tracklets into 3D object
reconstructions using depth and ego-motion
estimates. These 3D reconstructions are then used
to estimate the 3D motion of objects, which is
used to merge tracklets into long-term tracks,
bridging occlusion gaps of up to 20 frames. This
also allows us to fill in missing detections.
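
The merging step described above can be illustrated with a minimal Python sketch. This is not the MOTSFusion implementation. It assumes each tracklet already carries a 3D position and a per-frame velocity (as the reconstruction stage might provide), uses a simple constant-velocity extrapolation, and merges greedily. The `Tracklet` class, the function names, and the `MAX_DIST_M` threshold are hypothetical; only the 20-frame gap limit comes from the description, and measuring the gap as a plain frame difference is also an assumption.

```python
import math
from dataclasses import dataclass

MAX_GAP_FRAMES = 20  # gap limit stated in the method description
MAX_DIST_M = 1.0     # hypothetical distance threshold, not from the source


@dataclass
class Tracklet:
    track_id: int
    start_frame: int
    end_frame: int
    start_pos: tuple  # (x, y, z) at start_frame, in world coordinates
    end_pos: tuple    # (x, y, z) at end_frame, in world coordinates
    velocity: tuple   # (vx, vy, vz) per frame, estimated at end_frame


def extrapolate(tracklet, frame):
    """Constant-velocity prediction of the 3D position at `frame`."""
    dt = frame - tracklet.end_frame
    return tuple(p + v * dt for p, v in zip(tracklet.end_pos, tracklet.velocity))


def can_merge(earlier, later):
    """True if `later` starts within the gap limit and near the prediction."""
    gap = later.start_frame - earlier.end_frame  # assumed gap definition
    if gap <= 0 or gap > MAX_GAP_FRAMES:
        return False
    predicted = extrapolate(earlier, later.start_frame)
    return math.dist(predicted, later.start_pos) <= MAX_DIST_M


def merge_tracklets(tracklets):
    """Greedily chain tracklets; returns {tracklet id: long-term track id}."""
    assignment = {}
    tails = []  # (last tracklet of a long-term track, its track id)
    for t in sorted(tracklets, key=lambda x: x.start_frame):
        best = None
        for i, (tail, tid) in enumerate(tails):
            if can_merge(tail, t):
                d = math.dist(extrapolate(tail, t.start_frame), t.start_pos)
                if best is None or d < best[0]:
                    best = (d, i, tid)
        if best is None:
            # No compatible predecessor: start a new long-term track.
            assignment[t.track_id] = t.track_id
            tails.append((t, t.track_id))
        else:
            # Extend the closest compatible long-term track.
            _, i, tid = best
            assignment[t.track_id] = tid
            tails[i] = (t, tid)
    return assignment


a = Tracklet(1, 0, 10, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.1, 0.0, 0.0))
b = Tracklet(2, 16, 30, (1.6, 0.0, 0.0), (3.0, 0.0, 0.0), (0.1, 0.0, 0.0))
c = Tracklet(3, 40, 50, (20.0, 0.0, 0.0), (21.0, 0.0, 0.0), (0.1, 0.0, 0.0))
tracks = merge_tracklets([a, b, c])
print(tracks)
```

In this toy input, tracklet 2 starts six frames after tracklet 1 ends and lies where the extrapolated motion predicts, so the two are chained; tracklet 3 starts far from the prediction and opens a new track. Frames inside a bridged gap could then be filled from the same extrapolation, which is one way to read the remark about filling in missing detections.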

Parameters:

Detections = TrackRCNN
Segmentations = BB2SegNet

Latex Bibtex:

@article{luiten2019MOTSFusion,
title={Track to Reconstruct and Reconstruct to
Track},
author={Luiten, Jonathon and Fischer, Tobias and
Leibe, Bastian},
journal={IEEE Robotics and Automation Letters},
year={2020},
publisher={IEEE}
}

Detailed Results

Across all 29 test sequences, our benchmark computes the HOTA tracking metrics (HOTA, DetA, AssA, DetRe, DetPr, AssRe, AssPr, LocA) [1] as well as the CLEAR MOT, MT/PT/ML, identity switch, and fragmentation metrics [2,3]. The tables below show all of these metrics.

Benchmark	HOTA	DetA	AssA	DetRe	DetPr	AssRe	AssPr	LocA
PEDESTRIAN	54.04 %	60.83 %	49.45 %	64.13 %	81.47 %	56.68 %	70.44 %	83.71 %

Benchmark	TP	FP	FN
PEDESTRIAN	15829	4868	463

Benchmark	MOTSA	MOTSP	MODSA	IDSW	sMOTSA
PEDESTRIAN	72.89 %	81.50 %	74.24 %	279	58.75 %

Benchmark	MT rate	PT rate	ML rate	FRAG
PEDESTRIAN	47.41 %	37.04 %	15.56 %	522

Benchmark	# Dets	# Tracks
PEDESTRIAN	16292	293
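
As a rough consistency check, the accuracy figures can be recomputed from the counts in the tables above. The Python sketch below assumes the usual MOTS definitions: MODSA = (TP - FP) / GT, MOTSA = (TP - FP - IDSW) / GT, and sMOTSA = (soft TP - FP - IDSW) / GT, where GT = TP + FN and soft TP is approximated as MOTSP x TP. It also assumes that the value 463 is the false-positive count and 4868 the false-negative count; under that reading TP + FP equals the 16292 reported detections. Both the formulas and that reading are assumptions of this sketch, not statements from the benchmark page.

```python
# Values copied from the result tables.
TP = 15829
FP = 463        # assumption: 463 is the false-positive count
FN = 4868       # assumption: 4868 is the false-negative count
IDSW = 279
MOTSP = 0.8150  # reported as 81.50 %, so it is rounded

num_gt = TP + FN    # ground-truth masks under the assumed reading
num_dets = TP + FP  # detections under the assumed reading

modsa = (TP - FP) / num_gt
motsa = (TP - FP - IDSW) / num_gt

# Soft TP (sum of mask IoUs over true positives) approximated from MOTSP.
soft_tp = MOTSP * TP
smotsa = (soft_tp - FP - IDSW) / num_gt

print(f"# Dets: {num_dets}")
print(f"MODSA:  {100 * modsa:.2f} %")
print(f"MOTSA:  {100 * motsa:.2f} %")
print(f"sMOTSA: {100 * smotsa:.2f} % (approximate)")
```

Under these assumptions the recomputed MODSA and MOTSA agree with the reported 74.24 % and 72.89 %, and the sMOTSA estimate lands within rounding error of the reported 58.75 %.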

[1] J. Luiten, A. Ošep, P. Dendorfer, P. Torr, A. Geiger, L. Leal-Taixé, B. Leibe: HOTA: A Higher Order Metric for Evaluating Multi-object Tracking. IJCV 2020.
[2] K. Bernardin, R. Stiefelhagen: Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. JIVP 2008.
[3] Y. Li, C. Huang, R. Nevatia: Learning to associate: HybridBoosted multi-target tracker for crowded scene. CVPR 2009.

A project of Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago
