
Publications of Carolin Schmitt

Towards Scalable Multi-View Reconstruction of Geometry and Materials
C. Schmitt, B. Antic, A. Neculai, J. Lee and A. Geiger
Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Abstract: In this paper, we propose a novel method for joint recovery of camera pose, object geometry and spatially-varying Bidirectional Reflectance Distribution Function (svBRDF) of 3D scenes that exceed object-scale and hence cannot be captured with stationary light stages. The inputs are high-resolution RGB-D images captured by a mobile, hand-held capture system with point lights for active illumination. Compared to previous works that jointly estimate geometry and materials from a hand-held scanner, we formulate this problem using a single objective function that can be minimized using off-the-shelf gradient-based solvers. To facilitate scalability to large numbers of observation views and optimization variables, we introduce a distributed optimization algorithm that reconstructs 2.5D keyframe-based representations of the scene. A novel multi-view consistency regularizer effectively synchronizes neighboring keyframes such that the local optimization results allow for seamless integration into a globally consistent 3D model. We provide a study on the importance of each component in our formulation and show that our method compares favorably to baselines. We further demonstrate that our method accurately reconstructs various objects and materials and allows for expansion to spatially larger scenes. We believe that this work represents a significant step towards making geometry and material estimation from hand-held scanners scalable.
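The single-objective, gradient-based formulation lends itself to a compact sketch. Below is a minimal, hypothetical PyTorch illustration (not the authors' code): per-keyframe 2.5D parameters are optimized jointly by an off-the-shelf solver, and a consistency term couples neighboring keyframes. The rendering placeholder, parameter shapes and the weight LAMBDA are assumptions for illustration only; the paper's rendering model and regularizer are far richer.

import torch

# Per-keyframe 2.5D parameters: depth, normals, svBRDF and a pose increment.
params = {
    "depth":  torch.rand(1, 480, 640, requires_grad=True),
    "normal": torch.rand(3, 480, 640, requires_grad=True),
    "brdf":   torch.rand(7, 480, 640, requires_grad=True),
    "pose":   torch.zeros(6, requires_grad=True),  # se(3) increment
}
opt = torch.optim.Adam(params.values(), lr=1e-3)

def render(p):
    # Placeholder for the differentiable point-light rendering model
    # (ignores pose and normals here; the actual model is far richer).
    return p["brdf"].mean(0, keepdim=True) * p["depth"]

target = torch.rand(1, 480, 640)          # observed image (stand-in)
neighbor_depth = torch.rand(1, 480, 640)  # neighboring keyframe's depth (stand-in)

LAMBDA = 0.1  # hypothetical weight of the multi-view consistency regularizer
for step in range(200):
    opt.zero_grad()
    photometric = (render(params) - target).abs().mean()
    # Toy consistency term; the paper warps via relative poses instead.
    consistency = (params["depth"] - neighbor_depth).pow(2).mean()
    loss = photometric + LAMBDA * consistency
    loss.backward()
    opt.step()

Because everything is expressed as one differentiable objective, the same loop scales out naturally: each worker optimizes its own keyframe while the consistency term keeps neighbors synchronized.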
LaTeX BibTeX Citation:
@article{Schmitt2023PAMI,
  author = {Carolin Schmitt and Bozidar Antic and Andrei Neculai and Joo Ho Lee and Andreas Geiger},
  title = {Towards Scalable Multi-View Reconstruction of Geometry and Materials},
  journal = {Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
  year = {2023}
}
SMD-Nets: Stereo Mixture Density Networks
F. Tosi, Y. Liao, C. Schmitt and A. Geiger
Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Abstract: Although stereo matching accuracy has greatly improved through deep learning in the last few years, recovering sharp boundaries and high-resolution outputs efficiently remains challenging. In this paper, we propose Stereo Mixture Density Networks (SMD-Nets), a simple yet effective learning framework compatible with a wide class of 2D and 3D architectures which ameliorates both issues. Specifically, we exploit bimodal mixture densities as output representation and show that this allows for sharp and precise disparity estimates near discontinuities while explicitly modeling the aleatoric uncertainty inherent in the observations. Moreover, we formulate disparity estimation as a continuous problem in the image domain, allowing our model to query disparities at arbitrary spatial precision. We carry out comprehensive experiments on a new high-resolution and highly realistic synthetic stereo dataset, consisting of stereo pairs at 8Mpx resolution, as well as on real-world stereo datasets. Our experiments demonstrate increased depth accuracy near object boundaries and prediction of ultra high-resolution disparity maps on standard GPUs. We demonstrate the flexibility of our technique by improving the performance of a variety of stereo backbones.
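To make the output representation concrete, here is a minimal, hypothetical PyTorch sketch of a bimodal mixture-density head: the network predicts the mixture weight, means and scales of two Laplacian modes per queried location, is trained with the mixture negative log-likelihood, and at test time returns the mean of the dominant mode. The class and function names and the exact parameterization are assumptions, not the paper's implementation; the continuous formulation comes from sampling the input features at arbitrary (sub-pixel) image coordinates, e.g. by bilinear interpolation.

import torch
import torch.nn as nn

class BimodalHead(nn.Module):
    """Maps per-point features to a two-mode Laplacian mixture over disparity."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 5),  # pi_logit, mu1, log_b1, mu2, log_b2
        )

    def forward(self, feats):
        pi_logit, mu1, log_b1, mu2, log_b2 = self.mlp(feats).unbind(-1)
        return torch.sigmoid(pi_logit), mu1, log_b1.exp(), mu2, log_b2.exp()

def mixture_nll(pi, mu1, b1, mu2, b2, d):
    """Negative log-likelihood of ground-truth disparity d under the mixture."""
    lap1 = torch.exp(-(d - mu1).abs() / b1) / (2 * b1)
    lap2 = torch.exp(-(d - mu2).abs() / b2) / (2 * b2)
    return -torch.log(pi * lap1 + (1 - pi) * lap2 + 1e-8)

def predict(pi, mu1, b1, mu2, b2):
    # Mean of the dominant mode: unlike a unimodal mean, this never
    # blurs across a depth discontinuity.
    return torch.where(pi > 0.5, mu1, mu2)

feats = torch.rand(1024, 64)  # features sampled at 1024 continuous image points
pi, mu1, b1, mu2, b2 = BimodalHead()(feats)
loss = mixture_nll(pi, mu1, b1, mu2, b2, torch.rand(1024) * 100).mean()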
LaTeX BibTeX Citation:
@inproceedings{Tosi2021CVPR,
  author = {Fabio Tosi and Yiyi Liao and Carolin Schmitt and Andreas Geiger},
  title = {SMD-Nets: Stereo Mixture Density Networks},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2021}
}
On Joint Estimation of Pose, Geometry and svBRDF from a Handheld Scanner
C. Schmitt, S. Donné, G. Riegler, V. Koltun and A. Geiger
Conference on Computer Vision and Pattern Recognition (CVPR), 2020
Abstract: We propose a novel formulation for joint recovery of camera pose, object geometry and spatially-varying BRDF. The input to our approach is a sequence of RGB-D images captured by a mobile, hand-held scanner that actively illuminates the scene with point light sources. Compared to previous works that jointly estimate geometry and materials from a hand-held scanner, we formulate this problem using a single objective function that can be minimized using off-the-shelf gradient-based solvers. By integrating material clustering as a differentiable operation into the optimization process, we avoid pre-processing heuristics and demonstrate that our model is able to determine the correct number of specular materials independently. We provide a study on the importance of each component in our formulation and on the requirements of the initial geometry. We show that optimizing over the poses is crucial for accurately recovering fine details and that our approach naturally results in a semantically meaningful material segmentation.
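The differentiable material clustering can be illustrated with a short, hypothetical PyTorch sketch: replacing hard cluster assignments with a temperature-controlled softmax over distances keeps the operation differentiable, so gradients flow through the clustering during joint optimization. Names, shapes and the toy objective below are assumptions for illustration, not the paper's formulation.

import torch

def soft_cluster(brdf, centroids, temperature=0.1):
    """brdf: (N, D) per-pixel specular parameters; centroids: (K, D)."""
    dists = torch.cdist(brdf, centroids)                 # (N, K) pairwise distances
    assign = torch.softmax(-dists / temperature, dim=1)  # soft assignments
    # Each pixel's clustered BRDF: assignment-weighted centroid mixture.
    return assign @ centroids, assign

brdf = torch.rand(1000, 3, requires_grad=True)
centroids = torch.rand(8, 3, requires_grad=True)  # K = 8 candidate materials
clustered, assign = soft_cluster(brdf, centroids)
loss = (clustered - brdf).pow(2).mean()           # toy objective
loss.backward()                                   # gradients flow through clustering

One can over-provision K: centroids that attract near-zero assignment mass become redundant, which is one plausible route to determining the number of specular materials without pre-processing heuristics.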
LaTeX BibTeX Citation:
@inproceedings{Schmitt2020CVPR,
  author = {Carolin Schmitt and Simon Donné and Gernot Riegler and Vladlen Koltun and Andreas Geiger},
  title = {On Joint Estimation of Pose, Geometry and svBRDF from a Handheld Scanner},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2020}
}
RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials (spotlight)
D. Paschalidou, A. O. Ulusoy, C. Schmitt, L. van Gool and A. Geiger
Conference on Computer Vision and Pattern Recognition (CVPR), 2018
Abstract: In this paper, we consider the problem of reconstructing a dense 3D model using images captured from different views. Recent methods based on convolutional neural networks (CNNs) allow learning the entire task from data. However, they do not incorporate the physics of image formation, such as perspective geometry and occlusion. In contrast, classical approaches based on Markov Random Fields (MRFs) with ray potentials explicitly model these physical processes, but they cannot cope with large surface appearance variations across different viewpoints. To this end, we propose RayNet, which combines the strengths of both frameworks. RayNet integrates a CNN that learns view-invariant feature representations with an MRF that explicitly encodes the physics of perspective projection and occlusion. We train RayNet end-to-end using empirical risk minimization. We thoroughly evaluate our approach on challenging real-world datasets and demonstrate its benefits over a piecewise-trained baseline, hand-crafted models, as well as other learning-based approaches.
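The ray potential at the core of this combination admits a compact differentiable form. As a hypothetical PyTorch sketch (not the authors' implementation, whose MRF inference is more involved): given per-voxel occupancy probabilities along a ray, the probability that the ray terminates at sample i is o_i * prod_{j<i}(1 - o_j), which yields a depth distribution that can be supervised end-to-end.

import torch

def ray_termination(occupancy):
    """occupancy: (R, S) occupancy probabilities for R rays, S samples each."""
    free = torch.cumprod((1 - occupancy).clamp(min=1e-8), dim=1)
    # Probability of reaching sample i unoccluded (shifted by one sample).
    free_before = torch.cat([torch.ones_like(free[:, :1]), free[:, :-1]], dim=1)
    return occupancy * free_before  # (R, S) depth distribution along each ray

occ = torch.sigmoid(torch.randn(4, 32))         # stand-in for CNN occupancy output
depth_dist = ray_termination(occ)
samples = torch.linspace(0.5, 5.0, 32)          # depths of the S ray samples
expected_depth = (depth_dist * samples).sum(1)  # differentiable depth estimate

Because ray_termination is differentiable, a loss on expected_depth (or on the full distribution) backpropagates into the CNN that produced the occupancies, which is what enables end-to-end training.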
LaTeX BibTeX Citation:
@inproceedings{Paschalidou2018CVPR,
  author = {Despoina Paschalidou and Ali Osman Ulusoy and Carolin Schmitt and Luc van Gool and Andreas Geiger},
  title = {RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2018}
}

