The KITTI Vision Benchmark Suite

Method

Robust 3D Multi-Object Tracking for Autonomous Driving with Adaptive LiDAR-Visual Fusion and Multile [FCOMOT(h)]

Submitted on 6 Aug. 2024 10:13 by
(Anonymous)

Running time:		0.01 s
Environment:		1 core @ 2.5 GHz (C/C++)

Method Description:

To increase the safety and reliability of
autonomous driving systems in complex traffic
environments, this paper proposes a novel 3D
multiobject tracking (MOT) method that integrates
center-plane adaptive multisensor fusion, motion
compensation, and multilevel data association.
Unlike traditional methods, our approach employs a
center-plane adaptive fusion strategy to align
LiDAR and visual data precisely, mitigating errors
in the target width caused by pose variations, and
improving tracking accuracy. To address vehicle
motion-induced association errors in dynamic
scenarios, we incorporate IMU and GPS data for
high-frequency vehicle pose estimation and
compensation, ensuring stable and robust target
association. Additionally, a rotational geometric
distance intersection-over-union (RGDIoU) cost
function is introduced, combined with multilevel
spatial indexing, to optimize the data association
efficiency and accuracy. The experimental results
on benchmark datasets, including KITTI and

Parameters:

LaTeX BibTeX:

@inproceedings{zhang2026motion,
title = {Robust 3D Multi-Object Tracking for
Autonomous Driving with Adaptive LiDAR-Visual
Fusion and Multile},
author = {*},
booktitle = {},
year = {2026},
note = {Submitted for review; under
consideration},
}

Detailed Results

From all 29 test sequences, our benchmark computes the commonly used tracking metrics: CLEAR MOT, MT/PT/ML, identity switches, and fragmentations [1,2]. The tables below show all of these metrics.

Benchmark	MOTA	MOTP	MODA	MODP
CAR	80.83 %	78.73 %	80.87 %	83.71 %
PEDESTRIAN	58.48 %	71.14 %	60.47 %	90.74 %

Benchmark	recall	precision	F1	TP	FP	FN	FAR	#objects	#trajectories
CAR	90.87 %	91.71 %	91.29 %	34470	3116	3462	28.01 %	42685	1651
PEDESTRIAN	77.76 %	82.32 %	79.97 %	18271	3925	5226	35.28 %	27439	1002
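
The recall, precision, and F1 columns above can be cross-checked from the TP, FP, and FN counts in the same rows. The following is a minimal sketch in Python, assuming the standard definitions recall = TP / (TP + FN), precision = TP / (TP + FP), and F1 as the harmonic mean of the two. The function name `detection_scores` is hypothetical and is not part of the KITTI evaluation code.

```python
def detection_scores(tp: int, fp: int, fn: int) -> dict:
    """Return recall, precision and F1 in percent from raw detection counts.

    Assumes the standard definitions; this is an illustrative helper,
    not the benchmark's own evaluation script.
    """
    recall = tp / (tp + fn)          # share of ground-truth objects that were matched
    precision = tp / (tp + fp)       # share of reported objects that were correct
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "recall": 100.0 * recall,
        "precision": 100.0 * precision,
        "f1": 100.0 * f1,
    }


# (TP, FP, FN) copied from the table rows above.
counts = {
    "CAR": (34470, 3116, 3462),
    "PEDESTRIAN": (18271, 3925, 5226),
}

for name, (tp, fp, fn) in counts.items():
    s = detection_scores(tp, fp, fn)
    print(f"{name}: recall {s['recall']:.2f} %  "
          f"precision {s['precision']:.2f} %  F1 {s['f1']:.2f} %")
```

Rounded to two decimals, the values computed this way agree with the recall, precision, and F1 entries in the table for both classes. The other columns are not recomputed in this sketch.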

Benchmark	MT	PT	ML	IDS	FRAG
CAR	73.85 %	22.92 %	3.23 %	16	330
PEDESTRIAN	53.26 %	36.77 %	9.97 %	460	1323


[1] K. Bernardin, R. Stiefelhagen: Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. JIVP 2008.
[2] Y. Li, C. Huang, R. Nevatia: Learning to associate: HybridBoosted multi-target tracker for crowded scene. CVPR 2009.

A project of Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago