Submitted on 11 Nov. 2022 22:36 by
Inkyu Shin (KAIST)

Running time: 1 s
Environment: 1 core @ 2.5 GHz (C/C++)

Video-kMaX consists of two components: a within-clip
segmenter (for clip-level segmentation) and a
cross-clip associater (for association beyond clips). We
propose clip-kMaX (clip k-means mask transformer) and
HiLA-MB (Hierarchical Location-Aware Memory Buffer)
to instantiate the segmenter and associater, respectively.
The clip length is set to 2. Runtime is not

Detailed Results

Over all 29 test sequences, our benchmark computes the Segmentation and Tracking Quality (STQ) metric together with its two components, Association Quality (AQ) and Segmentation Quality (SQ, measured as IoU). The table below shows all of these metrics.

Benchmark     STQ        AQ         SQ (IoU)
KITTI-STEP    68.47 %    67.20 %    69.77 %
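
As a quick consistency check on the numbers above: STQ, as defined for the STEP benchmarks, is the geometric mean of AQ and SQ. The sketch below recomputes it from the reported components. The function name `stq_from_components` is our own illustration and not part of the KITTI evaluation kit, and because the reported AQ and SQ are rounded to two decimals, agreement is only expected up to rounding.

```python
import math


def stq_from_components(aq: float, sq: float) -> float:
    """Return the geometric mean of AQ and SQ (both given in percent)."""
    return math.sqrt(aq * sq)


# AQ and SQ as reported for KITTI-STEP in the results table.
stq = stq_from_components(67.20, 69.77)
print(f"{stq:.2f} %")  # prints "68.47 %"
```

The recomputed value matches the reported STQ of 68.47 %. Because the metric is a geometric mean, a method cannot reach a high STQ by doing well on only one of association or segmentation.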

