Vision Group

3D Urban Scene Understanding

In this project, we investigate probabilistic generative models for multi-object traffic scene understanding from movable platforms. The models reason jointly about the 3D scene layout and the location and orientation of objects in the scene. In particular, we are interested in inferring the scene topology, geometry and traffic activities from short video sequences. Inspired by the impressive driving capabilities of humans, our models do not rely on GPS, lidar or map knowledge. Instead, they take advantage of a diverse set of visual cues in the form of vehicle tracklets, vanishing points, semantic scene labels, scene flow and occupancy grids. For each of these cues we propose likelihood functions that are integrated into a probabilistic generative model. We learn all model parameters from training data using contrastive divergence. Experiments conducted on videos of 113 representative intersections show that we are able to infer the correct layout in a variety of very challenging scenarios. To evaluate the importance of each feature cue, we conduct experiments using different feature combinations. Furthermore, we show how employing context allows us to improve over the state of the art in object detection and object orientation estimation in challenging and cluttered urban environments.
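To illustrate the idea of integrating per-cue likelihood functions into one model, here is a minimal sketch. It is not the model from the publications: it assumes the cues are conditionally independent given the layout, uses 1D Gaussian likelihoods, and all names (`score_layout`, `best_layout`, the cue names and the toy numbers) are hypothetical. Parameter learning with contrastive divergence is not shown.

```python
import math


def gaussian_log_likelihood(observed, predicted, sigma):
    """Log-density of `observed` under a 1D Gaussian centred at `predicted`."""
    z = (observed - predicted) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def score_layout(layout, observations, sigmas):
    """Total log-likelihood of all cues for one candidate layout.

    Assumption: cues are conditionally independent given the layout,
    so their log-likelihoods simply add up.
    """
    return sum(
        gaussian_log_likelihood(observations[cue], layout[cue], sigmas[cue])
        for cue in observations
    )


def best_layout(candidates, observations, sigmas):
    """Name of the candidate layout with the highest total log-likelihood."""
    return max(
        candidates,
        key=lambda name: score_layout(candidates[name], observations, sigmas),
    )


# Hypothetical toy data: each layout predicts a value for every cue (radians).
candidates = {
    "straight": {"vanishing_point_angle": 0.0, "tracklet_heading": 0.0},
    "crossing": {"vanishing_point_angle": 0.0, "tracklet_heading": 1.57},
}
observations = {"vanishing_point_angle": 0.05, "tracklet_heading": 1.4}
sigmas = {"vanishing_point_angle": 0.1, "tracklet_heading": 0.3}

print(best_layout(candidates, observations, sigmas))
```

In this toy setup both layouts explain the vanishing point equally well, so the tracklet heading decides between them. Dropping a cue from `observations` gives a crude way to see how much each cue contributes, in the spirit of the feature-combination experiments mentioned above.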

Here is a short overview of the relevant publications. For BibTeX citations, please scroll further down this page.


Code and data are provided under the GNU General Public License and the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.


If you use this work, please cite the following articles related to this project:

  author = {Andreas Geiger and Martin Lauer and Christian Wojek and Christoph Stiller and Raquel Urtasun},
  title = {3D Traffic Scene Understanding from Movable Platforms},
  journal = {Pattern Analysis and Machine Intelligence (PAMI)},
  year = {2014}

  author = {Andreas Geiger},
  title = {Probabilistic Models for 3D Urban Scene Understanding from Movable Platforms},
  school = {KIT},
  year = {2013}

  author = {Hongyi Zhang and Andreas Geiger and Raquel Urtasun},
  title = {Understanding High-Level Semantics by Modeling Traffic Patterns},
  booktitle = {International Conference on Computer Vision (ICCV)},
  year = {2013}

  author = {Andreas Geiger and Christian Wojek and Raquel Urtasun},
  title = {Joint 3D Estimation of Objects and Scene Layout},
  booktitle = {Advances in Neural Information Processing Systems (NIPS)},
  year = {2011}

  author = {Andreas Geiger and Martin Lauer and Raquel Urtasun},
  title = {A Generative Model for 3D Urban Scene Understanding from Movable Platforms},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2011}
