The KITTI Vision Benchmark Suite

Method

PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object Tracking? [PolarMOT]
https://polarmot.github.io/

Submitted on 12 Dec. 2021 02:16 by
Aleksandr Kim (Technical University of Munich)

Running time:		0.02 s
Environment:		1 core @ 2.5 GHz (C/C++)

Method Description:

A state-of-the-art, generalizable multi-object tracker that poses tracking as edge classification on a continuously evolving temporal multiplex graph. The graph's initial edge features are only the pairwise geometric relationships (temporal and spatial) between objects.

We encode 3D detections as nodes in a graph, where spatial and temporal pairwise relations among objects are encoded via localized polar coordinates on graph edges. This representation makes our geometric relations invariant to global transformations and smooth trajectory changes, especially under non-holonomic motion. As a result, our graph neural network can learn to encode temporal and spatial interactions effectively and fully leverage contextual and motion cues, obtaining the final scene interpretation by posing data association as edge classification. We establish a new state of the art on nuScenes and show that PolarMOT generalizes remarkably well across different locations (Boston, Singapore) and datasets (nuScenes and KITTI).
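
The invariance to global transformations described above can be illustrated with a small sketch. It assumes each detection is reduced to a hypothetical (x, y, yaw, t) tuple, and the feature layout (distance, bearing, heading difference, time gap) is chosen for illustration only; it is not taken from the paper or its released code.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def polar_edge_features(det_i, det_j):
    """Describe det_j relative to det_i's local frame.

    Each detection is a hypothetical (x, y, yaw, t) tuple: ground-plane
    position, heading in radians, and timestamp in seconds.
    """
    xi, yi, yaw_i, ti = det_i
    xj, yj, yaw_j, tj = det_j
    dx, dy = xj - xi, yj - yi
    distance = math.hypot(dx, dy)
    # Bearing of det_j measured from det_i's own heading, not a world axis.
    bearing = wrap_angle(math.atan2(dy, dx) - yaw_i)
    heading_diff = wrap_angle(yaw_j - yaw_i)
    return distance, bearing, heading_diff, tj - ti


def rigid_transform(det, theta, tx, ty):
    """Rotate a detection by theta about the origin, then translate it."""
    x, y, yaw, t = det
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + tx, s * x + c * y + ty, yaw + theta, t)


if __name__ == "__main__":
    a = (1.0, 2.0, 0.3, 0.0)
    b = (4.0, 6.0, 0.5, 0.1)
    moved_a = rigid_transform(a, 1.1, -7.0, 3.0)
    moved_b = rigid_transform(b, 1.1, -7.0, 3.0)
    print(polar_edge_features(a, b))
    print(polar_edge_features(moved_a, moved_b))
```

Both printed tuples agree up to floating-point error, because every feature is measured relative to the first detection rather than to a world frame.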

Parameters:

See paper

Latex Bibtex:

@inproceedings{polarmot,
author = {Aleksandr Kim and Guillem Bras{\'o} and Aljo\v{s}a O\v{s}ep and Laura Leal-Taix{\'e}},
title = {PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object Tracking?},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2022},
}

Detailed Results

From all 29 test sequences, our benchmark computes the commonly used tracking metrics: CLEAR MOT, MT/PT/ML, identity switches, and fragmentations [1,2]. The tables below show all of these metrics.

Benchmark	MOTA	MOTP	MODA	MODP
CAR	85.31 %	85.52 %	86.49 %	88.57 %
PEDESTRIAN	47.25 %	64.87 %	48.29 %	88.92 %

Benchmark	recall	precision	F1	TP	FP	FN	FAR	#objects	#trajectories
CAR	92.67 %	94.40 %	93.53 %	33554	1990	2655	17.89 %	42674	1686
PEDESTRIAN	63.58 %	80.99 %	71.24 %	14823	3480	8491	31.28 %	21750	1407
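
As a sanity check on the table above, the recall, precision, and F1 columns can be recomputed from the TP, FP, and FN counts. The sketch below assumes the standard definitions of these three scores; the function name is hypothetical, and it does not attempt to reproduce MOTA, MODA, or FAR, whose exact normalization on this benchmark is not stated here.

```python
def detection_scores(tp, fp, fn):
    """Return (recall, precision, F1) as percentages from raw counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * value, 2) for value in (recall, precision, f1))


if __name__ == "__main__":
    # Counts copied from the CAR and PEDESTRIAN rows of the table.
    print("CAR       ", detection_scores(tp=33554, fp=1990, fn=2655))
    print("PEDESTRIAN", detection_scores(tp=14823, fp=3480, fn=8491))
```

Under these definitions the recomputed values match the reported percentages to two decimal places.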

Benchmark	MT	PT	ML	IDS	FRAG
CAR	81.38 %	16.31 %	2.31 %	408	900
PEDESTRIAN	30.24 %	51.20 %	18.56 %	241	1375

[1] K. Bernardin, R. Stiefelhagen: Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. JIVP 2008.
[2] Y. Li, C. Huang, R. Nevatia: Learning to associate: HybridBoosted multi-target tracker for crowded scene. CVPR 2009.

A project of Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago