Method

contextual-aware transformer for 3D point cloud automatic annotation [CAT]
[Anonymous Submission]

Submitted on 30 Jul. 2022 04:35 by
[Anonymous Submission]

Running time: 1 s
Environment: 1 core @ 2.5 GHz (Python)

Method Description:
3D point cloud annotation is laborious, spurring
the development of 3D automatic annotation.
However, to preserve high-quality 3D
annotations, current methods usually introduce
complicated annotation pipelines, e.g., staged
training for 3D foreground/background
segmentation, cylindrical object proposals, or
point completion. We propose a simple yet
effective end-to-end Context-Aware Transformer
(CAT) as an automated 3D-box labeler that requires
only a small set of labeled 3D point clouds
and weak 2D boxes. Compared to existing annotation
methods, CAT requires minimal modifications to the
vanilla Transformer block. We adopt a Transformer
encoder consisting of local and global parts and a
standard Transformer decoder. Specifically, the
local encoder explicitly models long-range
contexts among points at the object level, and the
global encoder captures contextual interactions
across objects for point feature learning.
Parameters:
depth=[6,3,3]
embed_dim=512
num_points=1024
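
The split between a local (within-object) and a global (across-object) encoder can be illustrated with a toy sketch. This is not the submitted implementation: the attention below is single-head with identity projections, the mean-pooled object summary and the residual add in global_encoder are assumptions made for illustration, and the decoder is omitted. The names self_attention, local_encoder and global_encoder are hypothetical, and the toy sizes stand in for num_points=1024 and embed_dim=512.

```python
import math


def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(tokens[0])
    out = []
    for query in tokens:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w * value[d] for w, value in zip(weights, tokens)) / total
                    for d in range(dim)])
    return out


def local_encoder(objects):
    """Attend among the points of each object separately."""
    return [self_attention(points) for points in objects]


def global_encoder(objects):
    """Assumed design: mean-pool each object, attend across objects, add back."""
    dim = len(objects[0][0])
    summaries = [[sum(p[d] for p in points) / len(points) for d in range(dim)]
                 for points in objects]
    context = self_attention(summaries)
    return [[[p[d] + ctx[d] for d in range(dim)] for p in points]
            for points, ctx in zip(objects, context)]


# Two toy objects with three points each and 2-D point features.
objects = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]],
]
features = global_encoder(local_encoder(objects))
```

The output keeps one feature vector per point, so a decoder could consume it directly; a real model would add learned projections, multiple heads and stacked layers.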

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).
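
As a rough illustration of the orientation term behind AOS, the sketch below assumes the cosine-based similarity (1 + cos(delta)) / 2 commonly cited for this benchmark, averaged over all detections at a given recall level, with unmatched detections contributing zero. The function name and the input format are hypothetical, and the averaging over recall levels that yields the final AOS is left out.

```python
import math


def orientation_similarity(detections):
    """Mean orientation similarity over all detections at one recall level.

    detections: list of (delta_theta, matched) pairs, where delta_theta is the
    angle difference in radians between predicted and ground-truth orientation,
    and matched is False for detections not assigned to any ground-truth box.
    """
    if not detections:
        return 0.0
    total = sum((1.0 + math.cos(delta)) / 2.0
                for delta, matched in detections if matched)
    return total / len(detections)


perfect = orientation_similarity([(0.0, True)])
flipped = orientation_similarity([(math.pi, True)])
with_false_positive = orientation_similarity([(0.0, True), (0.0, False)])
```

A perfectly aligned detection scores 1, a detection that is off by 180 degrees scores 0, and false positives pull the average down.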


Benchmark               Easy      Moderate  Hard
Car (Detection)         95.94 %   92.42 %   87.36 %
Car (Orientation)       95.84 %   92.04 %   86.83 %
Car (3D Detection)      84.84 %   75.22 %   70.05 %
Car (Bird's Eye View)   91.48 %   85.97 %   80.93 %
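
Under a cosine-based similarity that never exceeds 1, AOS cannot exceed the detection AP, so the ratio AOS/AP gives a quick read on how much is lost to orientation error alone. The short calculation below uses only the Car (Detection) and Car (Orientation) rows above; the variable names are illustrative.

```python
# Values copied from the Car (Detection) and Car (Orientation) rows, in percent.
ap = {"Easy": 95.94, "Moderate": 92.42, "Hard": 87.36}
aos = {"Easy": 95.84, "Moderate": 92.04, "Hard": 86.83}

# Ratio close to 1 means detected cars are almost always oriented correctly.
ratio = {level: aos[level] / ap[level] for level in ap}

for level, value in ratio.items():
    print(f"{level}: AOS/AP = {value:.4f}")
```

All three ratios stay above 0.99, so most of the gap between difficulty levels comes from detection rather than orientation.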


2D object detection results.



Orientation estimation results.



3D object detection results.



Bird's eye view results.



