Method

ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape [ROI-10D]


Submitted on 17 Nov. 2018 07:47 by
Allan Raventos (Stanford University)

Running time: 0.2 s
Environment: GPU @ 3.5 GHz (Python)

Method Description:
We present a deep learning method for end-to-end monocular 3D object detection and metric shape retrieval. We propose a novel loss formulation that lifts 2D detection, orientation, and scale estimation into 3D space. Instead of optimizing these quantities separately, the 3D instantiation allows the metric misalignment of boxes to be measured properly.
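The idea of measuring misalignment in 3D rather than per quantity can be illustrated with a small sketch. It is not the authors' implementation: the yaw-only box parameterization, the function names `box_corners` and `corner_loss`, and the choice of mean Euclidean corner distance are all assumptions made for illustration.

```python
import math


def box_corners(center, dims, yaw):
    """Return the 8 corners of a 3D box as (x, y, z) tuples.

    Assumed parameterization (hypothetical, for illustration only):
      center: (x, y, z) of the box centre in metres
      dims:   (width, height, length) in metres
      yaw:    rotation about the vertical (y) axis in radians
    """
    cx, cy, cz = center
    w, h, l = dims
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-l / 2, l / 2):
        for dy in (-h / 2, h / 2):
            for dz in (-w / 2, w / 2):
                # Rotate the offset about the y axis, then translate.
                x = c * dx + s * dz + cx
                y = dy + cy
                z = -s * dx + c * dz + cz
                corners.append((x, y, z))
    return corners


def corner_loss(pred, gt):
    """Mean Euclidean distance (metres) between corresponding corners.

    pred and gt are (center, dims, yaw) triples. Errors in position,
    size, and orientation all end up in one metric quantity.
    """
    pc = box_corners(*pred)
    gc = box_corners(*gt)
    return sum(math.dist(p, g) for p, g in zip(pc, gc)) / len(pc)


gt = ((0.0, 1.0, 20.0), (1.6, 1.5, 4.0), 0.0)
shifted = ((1.0, 1.0, 20.0), (1.6, 1.5, 4.0), 0.0)
rotated = ((0.0, 1.0, 20.0), (1.6, 1.5, 4.0), 0.3)
print(corner_loss(gt, gt), corner_loss(shifted, gt), corner_loss(rotated, gt))
```

Because every corner is expressed in metres, a position error and an orientation error contribute to the same scalar, which is the property the description attributes to the 3D instantiation.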
Parameters:
Lr=0.001
SGD with Momentum
LaTeX BibTeX:
@inproceedings{manhardt2018roi10d,
title={ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape},
author={Manhardt, Fabian and Kehl, Wadim and Gaidon, Adrien},
booktitle={Computer Vision and Pattern Recognition (CVPR)},
year={2019},
organization={IEEE}
}

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).
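As a rough illustration of the orientation term, the sketch below computes the cosine-based similarity that AOS averages over matched detections, (1 + cos Δθ) / 2, for a list of angle errors. The function name `orientation_similarity` is hypothetical, and the sketch leaves out the detection matching and the averaging over recall levels that the full benchmark metric performs.

```python
import math


def orientation_similarity(angle_errors):
    """Mean of (1 + cos(delta)) / 2 over matched detections.

    angle_errors: differences between predicted and ground-truth
    orientation, in radians, one per matched detection.
    Returns a value in [0, 1]; 1 means perfectly aligned.
    """
    if not angle_errors:
        raise ValueError("need at least one matched detection")
    total = sum((1.0 + math.cos(d)) / 2.0 for d in angle_errors)
    return total / len(angle_errors)


# A perfect match, a 90 degree error, and a flipped orientation.
print(orientation_similarity([0.0]))
print(orientation_similarity([math.pi / 2]))
print(orientation_similarity([math.pi]))
```

Since the similarity is at most 1 for each detection, an orientation score cannot exceed the corresponding detection score, which is consistent with the Car (Orientation) row sitting slightly below the Car (Detection) row.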


Benchmark                Easy      Moderate  Hard
Car (Detection)          75.33 %   69.64 %   61.18 %
Car (Orientation)        74.24 %   67.85 %   59.28 %
Car (3D Detection)       12.30 %   10.30 %    9.39 %
Car (Bird's Eye View)    16.77 %   12.40 %   11.39 %


2D object detection results.



Orientation estimation results.



3D object detection results.



Bird's eye view results.



