Semantic Segmentation Evaluation



This is the KITTI semantic segmentation benchmark. It consists of 200 semantically annotated training images as well as 200 test images, corresponding to the KITTI Stereo and Flow Benchmark 2015. The data format and metrics conform to The Cityscapes Dataset.

The data can be downloaded here:


Note: On 12.04.2018 we fixed several annotation errors in the dataset; please download the dataset again if you have an older version.


Our evaluation table ranks all methods according to the PASCAL VOC intersection-over-union metric (IoU): IoU = TP/(TP+FP+FN), where TP, FP, and FN denote the numbers of true positive, false positive, and false negative pixels, respectively. As in Cityscapes, we also report an instance-level intersection over union, iIoU = iTP/(iTP+FP+iFN). In contrast to the standard IoU measure, iTP and iFN are computed by weighting the contribution of each pixel by the ratio of the class' average instance size to the size of the respective ground-truth instance.

  • IoU class: Intersection over Union for each class, IoU = TP/(TP+FP+FN)
  • iIoU class: Instance-level Intersection over Union for each class, iIoU = iTP/(iTP+FP+iFN)
  • IoU category: Intersection over Union for each category, IoU = TP/(TP+FP+FN)
  • iIoU category: Instance-level Intersection over Union for each category, iIoU = iTP/(iTP+FP+iFN)
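The two metrics above can be sketched in a few lines of NumPy. This is a minimal illustration, not the official evaluation script: it assumes `pred` and `gt` are integer label maps of the same shape, that `instance_ids` assigns a unique id to each ground-truth instance of the evaluated class, and that the class' average instance size (in the real benchmark, computed over the whole dataset) is passed in as `avg_instance_size`.

```python
import numpy as np

def iou_per_class(pred, gt, num_classes):
    """Per-class IoU = TP / (TP + FP + FN) over pixel label maps."""
    ious = {}
    for c in range(num_classes):
        tp = np.sum((pred == c) & (gt == c))
        fp = np.sum((pred == c) & (gt != c))
        fn = np.sum((pred != c) & (gt == c))
        denom = tp + fp + fn
        if denom > 0:
            ious[c] = tp / denom
    return ious

def iiou_for_class(pred, gt, instance_ids, c, avg_instance_size):
    """Instance-level iIoU = iTP / (iTP + FP + iFN) for class c.

    Each ground-truth pixel is weighted by the ratio of the class'
    average instance size to the size of its own instance, so small
    instances are not dominated by large ones. FP stays unweighted,
    since false-positive pixels belong to no ground-truth instance.
    """
    mask_gt = (gt == c)
    itp = ifn = 0.0
    for inst in np.unique(instance_ids[mask_gt]):
        inst_mask = mask_gt & (instance_ids == inst)
        w = avg_instance_size / inst_mask.sum()   # per-pixel weight
        itp += w * np.sum(inst_mask & (pred == c))
        ifn += w * np.sum(inst_mask & (pred != c))
    fp = np.sum((pred == c) & ~mask_gt)
    denom = itp + fp + ifn
    return itp / denom if denom > 0 else float("nan")
```

Note that for a class where all instances have exactly the average size, every weight is 1 and iIoU reduces to the standard IoU.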


Additional information used by the methods
  • Laser Points: Method uses point clouds from Velodyne laser scanner
  • Depth: Method uses depth from stereo.
  • Video: Method uses 2 or more temporally adjacent images
  • Additional training data: Use of additional data sources for training (see details)


Rank | Method          | Code | IoU class | iIoU class | IoU category | iIoU category | Runtime | Environment
-----|-----------------|------|-----------|------------|--------------|---------------|---------|----------------------------------
1    | MapillaryAI_ROB |      | 69.56     | 43.17      | 86.52        | 68.89         | n s     | 1 core @ 2.5 GHz (C/C++)
2    | IBN-PSP-SA_ROB  |      | 67.51     | 34.23      | 87.04        | 66.33         | n s     | GPU @ 2.5 GHz (Python)
3    | ifly            |      | 66.77     | 33.96      | 85.05        | 62.25         | 1 s     | 1 core @ 2.5 GHz (C/C++)
4    | Mapillary_ROB   |      | 66.65     | 41.64      | 86.06        | 68.13         | n s     | 1 core @ 2.5 GHz (Python + C/C++)
5    | LDN2_ROB        | code | 63.51     | 28.31      | 85.34        | 59.07         | 1 s     | GPU @ 2.5 GHz (C/C++)
6    | AHiSS_ROB       |      | 61.24     | 26.94      | 81.54        | 53.42         | 0.06 s  | GPU @ 1.5 GHz (Python)
7    | SegStereo       |      | 59.10     | 28.00      | 81.31        | 60.26         | 0.6 s   | Nvidia GTX Titan Xp
8    | VENUS_ROB       | code | 59.05     | 28.75      | 80.94        | 60.09         | 0.5 s   | 8 cores @ 2.5 GHz (Python)
9    | AdapNetv2_ROB   |      | 54.97     | 25.20      | 81.64        | 56.31         | 1 s     | 1 core @ 2.5 GHz (C/C++)
10   | VlocNet++_ROB   |      | 53.92     | 23.68      | 80.74        | 53.66         | n s     | 1 core @ 2.5 GHz (C/C++)
11   | HiSS_ROB        |      | 53.16     | 21.37      | 78.32        | 51.92         | 0.06 s  | GPU @ 1.5 GHz (Python)
12   | APMoE_seg_ROB   | code | 47.96     | 17.86      | 78.11        | 49.17         | 0.2 s   | GPU @ 3.5 GHz (Matlab/C++)
13   | BatMAN_ROB      |      | 47.36     | 16.79      | 78.43        | 50.03         | 0.2 s   | GPU @ 1.5 GHz (Python)
14   | GoogLeNetV1_ROB |      | 45.29     | 18.75      | 74.44        | 47.61         | 0.05 s  | 1 core @ 2.5 GHz (C/C++)
15   | GoogLeV1_CS     |      | 43.63     | 16.40      | 71.83        | 41.63         | 0.03 ms | Titan Xp
16   | FCN101_ROB      |      | 24.57     | 6.19       | 51.85        | 22.81         | 0.07 s  | 2 cores @ 3.5 GHz (Python)
17   | Deleted         |      | 15.49     | 2.76       | 34.02        | 10.39         | 1 s     | 1 core @ 2.5 GHz (C/C++)

Reference for method 12: S. Kong and C. Fowlkes: Pixel-wise Attentional Gating for Parsimonious Pixel Labeling. arXiv:1805.01556, 2018.




Related Datasets

  • The Cityscapes Dataset: The Cityscapes dataset was recorded in 50 German cities and offers high-quality pixel-level annotations of 5,000 frames, in addition to a larger set of 20,000 weakly annotated frames.
  • Wilddash: Wilddash is a benchmark for semantic and instance segmentation. It aims to improve the expressiveness of performance evaluation for computer vision algorithms with regard to their robustness under real-world conditions.

Citation

When using this dataset in your research, we would appreciate it if you cite us:
@INPROCEEDINGS{Alhaija2017BMVC,
  author = {Hassan Abu Alhaija and Siva Karthik Mustikovela and Lars Mescheder and Andreas Geiger and Carsten Rother},
  title = {Augmented Reality Meets Deep Learning for Car Instance Segmentation in Urban Scenes},
  booktitle = {British Machine Vision Conference (BMVC)},
  year = {2017}
}


