Method

SAM2-based Multi-object Tracking and Segmentation using Zero-shot Learning [Seg2Track-SAM2]
github.com/hcmr-lab/Seg2Track-SAM2

Submitted on 9 Sep. 2025 16:24 by
Diogo Mendonça (Universidade de Coimbra)

Running time: 1 s
Environment: GPU @ 1.5 GHz (Python)

Method Description:
This method extends SAM2 to multi-object tracking
and segmentation in a zero-shot setting. Objects are
initialized with a detector and refined over time
through object reinforcement, ensuring consistent
masks across frames without extra training.
Parameters:
detection_threshold = 0.5
removal_threshold = 0.1
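
As an illustration only, the following Python sketch shows one way the loop described above could be organized: detections above detection_threshold spawn new object prompts, SAM2 propagates the corresponding masks frame by frame, and tracks whose mask confidence falls below removal_threshold are dropped. The helpers run_detector, sam2_propagate and overlaps_existing are hypothetical placeholders (not the released Seg2Track-SAM2 code and not the official SAM2 API), and the reinforcement/removal logic is a simplified reading of the method description.

import numpy as np

DETECTION_THRESHOLD = 0.5  # parameter from above: spawn a track from detections above this score
REMOVAL_THRESHOLD = 0.1    # parameter from above: drop tracks whose mask confidence falls below this


def run_detector(frame):
    # Hypothetical placeholder for the object detector: returns (boxes, scores).
    return np.empty((0, 4)), np.empty((0,))


def sam2_propagate(frame, tracks):
    # Hypothetical placeholder for SAM2 mask propagation:
    # returns {track_id: (binary_mask, confidence)} for every active track.
    return {tid: (np.zeros(frame.shape[:2], dtype=bool), 0.0) for tid in tracks}


def overlaps_existing(box, masks):
    # Hypothetical placeholder for detection-to-track association.
    return False


def track_sequence(frames):
    tracks = {}   # track_id -> most recent box prompt
    next_id = 0
    results = []
    for frame in frames:
        # 1) Zero-shot propagation: SAM2 predicts a mask for every active track.
        masks = sam2_propagate(frame, tracks)

        # 2) Object reinforcement / pruning: drop tracks with unreliable masks.
        for tid, (_, conf) in list(masks.items()):
            if conf < REMOVAL_THRESHOLD:
                tracks.pop(tid)
                masks.pop(tid)

        # 3) Initialization: unmatched, confident detections start new tracks.
        boxes, scores = run_detector(frame)
        for box, score in zip(boxes, scores):
            if score >= DETECTION_THRESHOLD and not overlaps_existing(box, masks):
                tracks[next_id] = box
                next_id += 1

        results.append(masks)
    return results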
LaTeX BibTeX:
@misc{mendonça2025seg2tracksam2sam2basedmultiobjecttracking,
title={Seg2Track-SAM2: SAM2-based Multi-object Tracking and Segmentation for Zero-shot Generalization},
author={Diogo Mendonça and Tiago Barros and Cristiano Premebida and Urbano J. Nunes},
year={2025},
eprint={2509.11772},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2509.11772},
}

Detailed Results

Over all 29 test sequences, the benchmark computes the commonly used tracking metrics, adapted to the segmentation case: CLEAR MOT, MT/PT/ML, identity switches, and fragmentations [1,2]. The tables below list all of these metrics.


Benchmark    sMOTSA   MOTSA    MOTSP    MODSA    MODSP
CAR          68.70 %  81.00 %  86.20 %  81.20 %  88.90 %
PEDESTRIAN   49.70 %  68.10 %  77.40 %  68.50 %  92.70 %

Benchmark    recall   precision  F1       TP     FP    FN    FAR      #objects  #trajectories
CAR          88.70 %  92.20 %    90.40 %  32616  2757  4144  24.80 %  47510     989
PEDESTRIAN   81.60 %  86.20 %    83.80 %  16882  2711  3815  24.40 %  27345     417

Benchmark    MT       PT       ML       IDS  FRAG
CAR          71.60 %  25.70 %  2.70 %   95   302
PEDESTRIAN   56.30 %  29.60 %  14.10 %  79   326
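
As a sanity check, the count-based metrics of the CAR row can be reproduced from its TP/FP/FN/IDS counts using the standard CLEAR MOT definitions [1]; sMOTSA and MOTSP additionally require the per-mask IoU scores of the true positives, which are not listed here. A short Python example:

# Count-based metrics recomputed from the CAR row above (CLEAR MOT definitions [1]).
TP, FP, FN, IDS = 32616, 2757, 4144, 95
GT = TP + FN                                               # ground-truth objects

recall    = TP / GT                                        # 0.887 -> 88.70 %
precision = TP / (TP + FP)                                 # 0.922 -> 92.20 %
f1        = 2 * precision * recall / (precision + recall)  # 0.904 -> 90.40 %
MODSA     = 1 - (FN + FP) / GT                             # 0.812 -> 81.20 %
MOTSA     = 1 - (FN + FP + IDS) / GT                       # 0.810 -> 81.00 %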



[1] K. Bernardin, R. Stiefelhagen: Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. JIVP 2008.
[2] Y. Li, C. Huang, R. Nevatia: Learning to associate: HybridBoosted multi-target tracker for crowded scene. CVPR 2009.

