The KITTI Vision Benchmark Suite

Method

Video-kMaX [Video-kMaX]

Submitted on 11 Nov. 2022 22:36 by
Inkyu Shin (KAIST)

Running time:		1 s
Environment:		1 core @ 2.5 GHz (C/C++)

Method Description:

Video-kMaX consists of two components: a within-clip
segmenter (for clip-level segmentation) and a
cross-clip associater (for association beyond clips).
We propose clip-kMaX (clip k-means mask transformer)
and HiLA-MB (Hierarchical Location-Aware Memory
Buffer) to instantiate the segmenter and associater,
respectively.

Parameters:

The clip length is set to 2. Runtime is not
measured.

Latex Bibtex:

@article{shin2023video,
  title={Video-kMaX: A simple unified approach for online and near-online video panoptic segmentation},
  author={Shin, Inkyu and Kim, Dahun and Yu, Qihang and Xie, Jun and Kim, Hong-Seok and Green, Bradley and Kweon, In So and Yoon, Kuk-Jin and Chen, Liang-Chieh},
  journal={arXiv preprint arXiv:2304.04694},
  year={2023}
}

Detailed Results

Across all 29 test sequences, our benchmark computes the Segmentation and Tracking Quality (STQ) metric together with its two components, Association Quality (AQ) and Segmentation Quality (SQ, measured as IoU). The table below shows all of these metrics.

Benchmark	STQ	AQ	SQ (IoU)
KITTI-STEP	68.47 %	67.20 %	69.77 %
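
As a quick consistency check on the row above: STQ is defined as the geometric mean of AQ and SQ, so the reported numbers can be cross-checked in a few lines of Python. This is a minimal sketch, not the benchmark's evaluation code; the function name `stq_from_components` is hypothetical and the inputs are simply the percentages copied from the table.

```python
import math


def stq_from_components(aq: float, sq: float) -> float:
    """Combine association quality (AQ) and segmentation quality (SQ)
    into STQ as their geometric mean. Inputs and output share one scale
    (here: percent)."""
    return math.sqrt(aq * sq)


# Values copied from the KITTI-STEP row above, in percent.
aq, sq = 67.20, 69.77
stq = stq_from_components(aq, sq)
print(f"STQ = {stq:.2f} %")  # prints: STQ = 68.47 %
```

The result matches the reported STQ of 68.47 %, so the three columns are mutually consistent.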

A project of Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago