Method

Cascaded Cross-modality Fusion Network for 3D Object Detection [CCFNET]


Submitted on 10 Nov. 2020 14:23 by
Zhiyu Chen (NJUPT)

Running time: 0.1 s
Environment: 1 core @ 2.5 GHz (C/C++)

Method Description:
We focus on exploring LIDAR-RGB fusion based 3D object detection in this paper.
Unlike classical 3D object detection, which relies on RGB or LIDAR alone to predict the location
and category of targets in 3D space, LIDAR-RGB fusion based 3D object detection utilizes both
sensors to capture more reliable and discriminative information about the environment. However,
this task remains challenging in two respects: 1) differences in data formats and sensor positions
lead to misalignment between the semantic features of images and the geometric features of the
point cloud; 2) optimizing the traditional IoU is not equivalent to minimizing the regression
loss of bounding boxes, resulting in biased back-propagation for non-overlapping cases. In this
work, we propose a Cascaded Cross-modality Fusion Network (CCFNet), which includes a cascaded
multi-scale fusion module (CMF) and a novel center 3D IoU loss to resolve these two issues.
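The second issue above can be illustrated with a small sketch: for a plain 3D IoU loss, any pair of non-overlapping boxes yields IoU = 0 regardless of how far apart they are, so the loss provides no useful gradient. A common remedy is to add a normalized center-distance penalty (a DIoU-style term extended to 3D). The exact form of CCFNet's center 3D IoU loss is not given here, so the following is only a minimal illustrative sketch with axis-aligned boxes, not the authors' implementation:

```python
import numpy as np

def iou_3d(box_a, box_b):
    """Axis-aligned 3D IoU. Boxes are arrays (cx, cy, cz, w, l, h)."""
    a_min, a_max = box_a[:3] - box_a[3:] / 2, box_a[:3] + box_a[3:] / 2
    b_min, b_max = box_b[:3] - box_b[3:] / 2, box_b[:3] + box_b[3:] / 2
    # Per-axis overlap, clipped at zero for non-overlapping cases.
    inter = np.clip(np.minimum(a_max, b_max) - np.maximum(a_min, b_min), 0, None)
    inter_vol = inter.prod()
    union_vol = box_a[3:].prod() + box_b[3:].prod() - inter_vol
    return inter_vol / union_vol

def center_iou_loss(box_a, box_b):
    """DIoU-style loss: 1 - IoU plus a normalized squared center distance.
    The extra term stays informative even when the boxes do not overlap."""
    center_dist_sq = np.sum((box_a[:3] - box_b[:3]) ** 2)
    # Normalize by the squared diagonal of the smallest enclosing box.
    enc_min = np.minimum(box_a[:3] - box_a[3:] / 2, box_b[:3] - box_b[3:] / 2)
    enc_max = np.maximum(box_a[:3] + box_a[3:] / 2, box_b[:3] + box_b[3:] / 2)
    diag_sq = np.sum((enc_max - enc_min) ** 2)
    return 1.0 - iou_3d(box_a, box_b) + center_dist_sq / diag_sq
```

For two identical boxes the loss is 0; for disjoint boxes the plain IoU term saturates at 1, while the center-distance term still grows with separation and so keeps the regression target well-behaved.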
Parameters:
N/A
Latex Bibtex:
N/A

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).


Benchmark               Easy      Moderate  Hard
Car (Detection)         95.85 %   92.25 %   89.36 %
Car (Orientation)       95.79 %   91.90 %   88.82 %
Car (3D Detection)      88.20 %   78.97 %   74.14 %
Car (Bird's Eye View)   94.25 %   87.97 %   83.27 %


2D object detection results.



Orientation estimation results.



3D object detection results.



Bird's eye view results.



