Method

TANet: Robust 3D Object Detection from Point Clouds with Triple Attention [TANet]
https://github.com/happinesslz/TANet.git

Submitted on 2 Sep. 2019 18:16 by
Zhe Liu (Huazhong University of Science and Technology)

Running time: 0.035 s
Environment: GPU @ 2.5 GHz (Python + C/C++)

Method Description:
In this paper, we focus on exploring the robustness of 3D object detection in point clouds, which has rarely been discussed in existing approaches. We observe two crucial phenomena: 1) the detection accuracy for hard objects, e.g., pedestrians, is unsatisfactory; 2) when additional noise points are added, the performance of existing approaches decreases rapidly. To alleviate these problems, we introduce a novel TANet, which mainly contains a Triple Attention (TA) module and a Coarse-to-Fine Regression (CFR) module. By jointly considering channel-wise, point-wise, and voxel-wise attention, the TA module enhances the crucial information of the target while suppressing unstable cloud points. Besides, a novel stacked TA further exploits multi-level feature attention. In addition, the CFR module boosts localization accuracy without excessive computation cost. Experimental results on the validation set of the KITTI dataset demonstrate that, in challenging noisy cases, i.e., with additional random noise points added around each object, the presented approach goes far beyond state-of-the-art approaches, especially for the Pedestrian class. The running speed is around 29 frames per second.
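The core idea of the TA module, reweighting voxelized point features with channel-wise, point-wise, and voxel-wise attention applied jointly, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: TANet learns the attention scores with small fully connected layers, whereas here the scores are derived from simple max/mean pooling, and the function name and shapes are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def triple_attention(voxel_feats):
    """Toy triple attention over voxelized point features.

    voxel_feats: array of shape (V, N, C) -- V voxels, N points per
    voxel, C feature channels. Illustrative only: the paper computes
    the scores with learned layers, not raw pooling.
    """
    # Point-wise attention: one score per point, from a max over channels.
    point_att = sigmoid(voxel_feats.max(axis=2, keepdims=True))    # (V, N, 1)
    # Channel-wise attention: one score per channel, from a max over points.
    channel_att = sigmoid(voxel_feats.max(axis=1, keepdims=True))  # (V, 1, C)
    attended = voxel_feats * point_att * channel_att
    # Voxel-wise attention: one global score per voxel, gating it as a whole.
    voxel_att = sigmoid(attended.mean(axis=(1, 2), keepdims=True)) # (V, 1, 1)
    return attended * voxel_att

feats = np.random.default_rng(0).normal(size=(8, 32, 16))
out = triple_attention(feats)
print(out.shape)  # attention only reweights features, so the shape is unchanged
```

Because each attention score lies in (0, 1), the module attenuates features rather than amplifying them, which matches the stated goal of suppressing unstable cloud points while preserving the target's crucial information.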
Parameters:
See the paper
Latex Bibtex:
@inproceedings{liu2020tanet,
title={TANet: Robust 3D Object Detection from Point Clouds with Triple Attention},
author={Zhe Liu and Xin Zhao and Tengteng Huang and Ruolan Hu and Yu Zhou and Xiang Bai},
year={2020},
booktitle={AAAI},
url={https://arxiv.org/pdf/1912.05163.pdf},
eprint={1912.05163},
archivePrefix={arXiv},
primaryClass={cs.CV}
}

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).


Benchmark Easy Moderate Hard
Car (Detection) 93.67 % 90.67 % 85.31 %
Car (Orientation) 93.52 % 90.11 % 84.61 %
Car (3D Detection) 84.39 % 75.94 % 68.82 %
Car (Bird's Eye View) 91.58 % 86.54 % 81.19 %
Pedestrian (Detection) 69.90 % 59.07 % 56.44 %
Pedestrian (Orientation) 42.54 % 36.21 % 34.39 %
Pedestrian (3D Detection) 53.72 % 44.34 % 40.49 %
Pedestrian (Bird's Eye View) 60.85 % 51.38 % 47.54 %
Cyclist (Detection) 82.24 % 68.20 % 62.13 %
Cyclist (Orientation) 81.15 % 66.37 % 60.10 %
Cyclist (3D Detection) 75.70 % 59.44 % 52.53 %
Cyclist (Bird's Eye View) 79.16 % 63.77 % 56.21 %


Precision-recall curves (figures):

2D object detection results.
Orientation estimation results.
3D object detection results.
Bird's eye view results.
