Ground Plane Polling for 6DoF Pose Estimation of Objects on the Road [GPP]

Submitted on 12 Jun. 2019 06:36 by
Akshay Rangesh (UC San Diego)

Running time: 0.23 s
Environment: GPU @ 1.5 GHz (Python + C/C++)

Method Description:
First, we train a single-stage convolutional neural network (CNN) that produces multiple visual and geometric cues of interest: 2D bounding
boxes, 2D keypoints of interest, coarse object orientations and object dimensions. Subsets of these cues are then used to poll probable
ground planes from a pre-computed database of ground planes, to identify the “best fit” plane with the highest consensus. Once identified, the
“best fit” plane provides enough constraints to successfully construct the desired 3D detection box, without directly predicting the 6DoF
pose of the object. The entire ground plane polling (GPP) procedure is constructed as a non-parametrized layer of the CNN that outputs the
desired “best fit” plane and the corresponding 3D keypoints, which together define the final 3D bounding box.
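
The polling step can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera with intrinsics K, planes given as (normal, offset) pairs satisfying normal . X + offset = 0 in the camera frame, and a hypothetical scoring rule that compares the distance between two back-projected ground keypoints with the predicted object length. The function names are illustrative only.

```python
import numpy as np


def backproject_to_plane(pixels, K_inv, normal, offset):
    """Intersect the camera rays through `pixels` (N, 2) with the plane
    normal . X + offset = 0 and return the 3D points (N, 3).

    Assumes no ray is parallel to the plane (non-zero denominator).
    """
    homog = np.hstack([pixels, np.ones((len(pixels), 1))])  # (N, 3) homogeneous pixels
    rays = homog @ K_inv.T                                   # ray directions, camera frame
    t = -offset / (rays @ normal)                            # scale at which each ray hits the plane
    return rays * t[:, None]


def poll_ground_planes(pixels, expected_length, K, planes):
    """Score every candidate plane and return (best_index, errors).

    `pixels` holds two 2D keypoints assumed to lie on the ground plane;
    `expected_length` is the predicted metric distance between them.
    The plane with the smallest length mismatch wins (an assumed,
    simplified stand-in for the consensus criterion).
    """
    K_inv = np.linalg.inv(K)
    errors = []
    for normal, offset in planes:
        pts = backproject_to_plane(pixels, K_inv, normal, offset)
        length = np.linalg.norm(pts[0] - pts[1])
        errors.append(abs(length - expected_length))
    return int(np.argmin(errors)), errors
```

Once a plane is selected, the back-projected points on it are the 3D keypoints, so no direct regression of the 6DoF pose is needed in this sketch either.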
LaTeX BibTeX:
title={Ground plane polling for 6dof pose estimation of
objects on the road},
author={Rangesh, Akshay and Trivedi, Mohan M},
journal={arXiv preprint arXiv:1811.06666},

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).
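
As a rough illustration of the orientation term behind AOS, the sketch below computes the orientation similarity at a single recall level as the mean of (1 + cos(delta)) / 2 over all detections, with unmatched detections contributing zero. The function name and argument layout are assumptions made for this example; the averaging over recall levels that yields the final AOS is not shown.

```python
import math


def orientation_similarity(angle_errors, matched):
    """Mean orientation similarity over a set of detections.

    angle_errors: difference in radians between predicted and ground-truth
                  orientation for each detection.
    matched:      True where the detection was assigned to a ground-truth
                  box; unmatched detections add 0 to the sum.
    """
    if not angle_errors:
        return 0.0
    total = sum(
        (1.0 + math.cos(delta)) / 2.0
        for delta, is_matched in zip(angle_errors, matched)
        if is_matched
    )
    return total / len(angle_errors)
```

Each term lies between 0 and 1, so under this definition the orientation score cannot exceed the corresponding detection score.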

Benchmark           Easy      Moderate  Hard
Car (Detection)     94.02 %   89.96 %   81.13 %
Car (Orientation)   93.94 %   89.68 %   80.60 %

2D object detection results.

Orientation estimation results.
