
Publications of Stefano Esposito

ConeGS: Error-Guided Densification Using Pixel Cones for Improved Reconstruction with Fewer Primitives (oral)
B. Baranowski, S. Esposito, P. Gschoßmann, A. Chen and A. Geiger
International Conference on 3D Vision (3DV), 2026
Abstract: 3D Gaussian Splatting (3DGS) achieves state-of-the-art image quality and real-time performance in novel view synthesis but often suffers from a suboptimal spatial distribution of primitives. This issue stems from cloning-based densification, which propagates Gaussians along existing geometry, limiting exploration and requiring many primitives to adequately cover the scene. We present ConeGS, an image-space-informed densification framework that is independent of existing scene geometry state. ConeGS first creates a fast Instant Neural Graphics Primitives (iNGP) reconstruction as a geometric proxy to estimate per-pixel depth. During the subsequent 3DGS optimization, it identifies high-error pixels and inserts new Gaussians along the corresponding viewing cones at the predicted depth values, initializing their size according to the cone diameter. A pre-activation opacity penalty rapidly removes redundant Gaussians, while a primitive budgeting strategy controls the total number of primitives, either by a fixed budget or by adapting to scene complexity, ensuring high reconstruction quality. Experiments show that ConeGS consistently enhances reconstruction quality and rendering performance across Gaussian budgets, with especially strong gains under tight primitive constraints where efficient placement is crucial.
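A minimal PyTorch-style sketch of the cone-based insertion step described in the abstract, assuming the per-pixel error map and the iNGP proxy depth are already available; the function name, tensor layout, and threshold are illustrative assumptions, not the authors' implementation:

import torch

def densify_from_error(error_map, depth_map, cam_origin, ray_dirs, focal, threshold=0.1):
    # Hypothetical sketch, not the authors' code.
    # error_map:  (H, W) per-pixel photometric error of the current 3DGS render
    # depth_map:  (H, W) depth predicted by the iNGP proxy reconstruction
    # cam_origin: (3,) camera center in world coordinates
    # ray_dirs:   (H, W, 3) unit viewing directions per pixel
    # focal:      focal length in pixels, used for the pixel-cone diameter
    high_error = error_map > threshold                    # pick high-error pixels
    depths = depth_map[high_error]                        # proxy depth along each selected ray
    dirs = ray_dirs[high_error]                           # corresponding viewing directions
    positions = cam_origin + dirs * depths.unsqueeze(-1)  # new Gaussians at the predicted depth
    # A pixel cone widens roughly as depth / focal; its diameter at the sample
    # point gives an isotropic initial scale for each inserted Gaussian.
    scales = (depths / focal).unsqueeze(-1).expand(-1, 3)
    return positions, scales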
LaTeX BibTeX Citation:
@inproceedings{Baranowski2026THREEDV,
  author = {Bartłomiej Baranowski and Stefano Esposito and Patricia Gschoßmann and Anpei Chen and Andreas Geiger},
  title = {ConeGS: Error-Guided Densification Using Pixel Cones for Improved Reconstruction with Fewer Primitives},
  booktitle = {International Conference on 3D Vision (3DV)},
  year = {2026}
}
Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes
S. Esposito, A. Chen, C. Reiser, S. Bulò, L. Porzi, K. Schwarz, C. Richardt, M. Zollhöfer, P. Kontschieder and A. Geiger
Conference on Computer Vision and Pattern Recognition (CVPR), 2025
Abstract: High-quality view synthesis relies on volume rendering, splatting, or surface rendering. While surface rendering is typically the fastest, it struggles to accurately model fuzzy geometry like hair. In turn, alpha-blending techniques excel at representing fuzzy materials but require an unbounded number of samples per ray (P1). Further overheads are induced by empty space skipping in volume rendering (P2) and sorting input primitives in splatting (P3). We present a novel representation for real-time view synthesis where the (P1) number of sampling locations is small and bounded, (P2) sampling locations are efficiently found via rasterization, and (P3) rendering is sorting-free. We achieve this by representing objects as semi-transparent multi-layer meshes rendered in a fixed order. First, we model surface layers as signed distance function (SDF) shells with optimal spacing learned during training. Then, we bake them as meshes and fit UV textures. Unlike single-surface methods, our multi-layer representation effectively models fuzzy objects. In contrast to volume and splatting-based methods, our approach enables real-time rendering on low-power laptops and smartphones.
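A minimal sketch of the fixed-order, front-to-back alpha compositing over the baked mesh layers, assuming each shell has already been rasterized to per-pixel RGB and opacity; names and shapes are illustrative assumptions, not the paper's renderer:

import torch

def composite_layers(layer_rgb, layer_alpha):
    # Hypothetical sketch, not the paper's renderer.
    # layer_rgb:   (K, H, W, 3) RGB rasterized from each of the K shell meshes,
    #              ordered front to back along the viewing direction
    # layer_alpha: (K, H, W, 1) per-layer opacity sampled from the baked UV textures
    color = torch.zeros_like(layer_rgb[0])
    transmittance = torch.ones_like(layer_alpha[0])
    # Because the layers are nested shells rendered in a fixed order,
    # front-to-back compositing needs no per-frame primitive sorting.
    for rgb, alpha in zip(layer_rgb, layer_alpha):
        color = color + transmittance * alpha * rgb
        transmittance = transmittance * (1.0 - alpha)
    return color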
LaTeX BibTeX Citation:
@inproceedings{Esposito2025CVPR,
  author = {Stefano Esposito and Anpei Chen and Christian Reiser and Samuel Rota Bulò and Lorenzo Porzi and Katja Schwarz and Christian Richardt and Michael Zollhöfer and Peter Kontschieder and Andreas Geiger},
  title = {Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2025}
}
LaRa: Efficient Large-Baseline Radiance Fields
A. Chen, H. Xu, S. Esposito, S. Tang and A. Geiger
European Conference on Computer Vision (ECCV), 2024
Abstract: Radiance field methods have achieved photorealistic novel view synthesis and geometry reconstruction. But they are mostly applied in per-scene optimization or small-baseline settings. While several recent works investigate feed-forward reconstruction with large baselines by utilizing transformers, they all operate with a standard global attention mechanism and hence ignore the local nature of 3D reconstruction. We propose a method that unifies local and global reasoning in transformer layers, resulting in improved quality and faster convergence. Our model represents scenes as Gaussian Volumes and combines this with an image encoder and Group Attention Layers for efficient feed-forward reconstruction. Experimental results demonstrate that our model, trained for two days on four GPUs, achieves high fidelity in reconstructing 360° radiance fields and is robust to zero-shot and out-of-domain testing.
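A minimal PyTorch sketch of group-wise self-attention over volume tokens, one plausible reading of the "Group Attention Layers" mentioned in the abstract; the class, group size, and token layout are assumptions, not the authors' code:

import torch
import torch.nn as nn

class GroupAttention(nn.Module):
    # Hypothetical sketch, not the authors' implementation: self-attention
    # restricted to local groups of volume tokens, intended to be interleaved
    # with standard global attention layers to mix local and global reasoning.
    def __init__(self, dim, num_heads=4, group_size=64):
        super().__init__()
        self.group_size = group_size
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, tokens):
        # tokens: (B, N, C), with N assumed divisible by group_size for simplicity
        B, N, C = tokens.shape
        g = self.group_size
        x = tokens.reshape(B * (N // g), g, C)  # partition tokens into local groups
        out, _ = self.attn(x, x, x)             # attend within each group only
        return out.reshape(B, N, C)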
LaTeX BibTeX Citation:
@inproceedings{Chen2024ECCV,
  author = {Anpei Chen and Haofei Xu and Stefano Esposito and Siyu Tang and Andreas Geiger},
  title = {LaRa: Efficient Large-Baseline Radiance Fields},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year = {2024}
}

