Publications of Despoina Paschalidou

Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks
D. Paschalidou, A. Katharopoulos, A. Geiger and S. Fidler
Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Abstract: Impressive progress in 3D shape extraction has led to representations that can capture object geometries with high fidelity. In parallel, primitive-based methods seek to represent objects as semantically consistent part arrangements. However, due to the simplicity of existing primitive representations, these methods fail to accurately reconstruct 3D shapes using a small number of primitives/parts. We address the trade-off between reconstruction quality and number of parts with Neural Parts, a novel 3D primitive representation that defines primitives using an Invertible Neural Network (INN) which implements homeomorphic mappings between a sphere and the target object. The INN allows us to compute the inverse mapping of the homeomorphism, which, in turn, enables the efficient computation of both the implicit surface function of a primitive and its mesh, without any additional post-processing. Our model learns to parse 3D objects into semantically consistent part arrangements without any part-level supervision. Evaluations on ShapeNet, D-FAUST and FreiHAND demonstrate that our primitives can capture complex geometries and thus simultaneously achieve geometrically accurate as well as interpretable reconstructions using an order of magnitude fewer primitives than state-of-the-art shape abstraction methods.
Latex Bibtex Citation:
@INPROCEEDINGS{Paschalidou2021CVPR,
  author = {Despoina Paschalidou and Angelos Katharopoulos and Andreas Geiger and Sanja Fidler},
  title = {Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2021}
}
Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image
D. Paschalidou, L. Van Gool and A. Geiger
Conference on Computer Vision and Pattern Recognition (CVPR), 2020
Abstract: Humans perceive the 3D world as a set of distinct objects that are characterized by various low-level (geometry, reflectance) and high-level (connectivity, adjacency, symmetry) properties. Recent methods based on convolutional neural networks (CNNs) demonstrated impressive progress in 3D reconstruction, even when using a single 2D image as input. However, the majority of these methods focus on recovering the local 3D geometry of an object without considering its part-based decomposition or relations between parts. We address this challenging problem by proposing a novel formulation that allows us to jointly recover the geometry of a 3D object as a set of primitives as well as their latent hierarchical structure without part-level supervision. Our model recovers the higher-level structural decomposition of various objects in the form of a binary tree of primitives, where simple parts are represented with fewer primitives and more complex parts are modeled with more components. Our experiments on the ShapeNet and D-FAUST datasets demonstrate that considering the organization of parts indeed facilitates reasoning about 3D geometry.
Latex Bibtex Citation:
@INPROCEEDINGS{Paschalidou2020CVPR,
  author = {Despoina Paschalidou and Luc {Van Gool} and Andreas Geiger},
  title = {Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2020}
}
PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds
A. Behl, D. Paschalidou, S. Donné and A. Geiger
Conference on Computer Vision and Pattern Recognition (CVPR), 2019
Abstract: Despite significant progress in image-based 3D scene flow estimation, the performance of such approaches has not yet reached the fidelity required by many applications. Simultaneously, these applications are often not restricted to image-based estimation: laser scanners provide a popular alternative to traditional cameras, for example in the context of self-driving cars, as they directly yield a 3D point cloud. In this paper, we propose to estimate 3D motion from such unstructured point clouds using a deep neural network. In a single forward pass, our model jointly predicts 3D scene flow as well as the 3D bounding box and rigid body motion of objects in the scene. While the prospect of estimating 3D scene flow from unstructured point clouds is promising, it is also a challenging task. We show that the traditional global representation of rigid body motion prohibits inference by CNNs, and propose a translation equivariant representation to circumvent this problem. For training our deep network, a large dataset is required. Because of this, we augment real scans from KITTI with virtual objects, realistically modeling occlusions and simulating sensor noise. A thorough comparison with classic and learning-based techniques highlights the robustness of the proposed approach.
Latex Bibtex Citation:
@INPROCEEDINGS{Behl2019CVPR,
  author = {Aseem Behl and Despoina Paschalidou and Simon Donn{\'e} and Andreas Geiger},
  title = {PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2019}
}
Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids
D. Paschalidou, A. Ulusoy and A. Geiger
Conference on Computer Vision and Pattern Recognition (CVPR), 2019
Abstract: Abstracting complex 3D shapes with parsimonious part-based representations has been a long-standing goal in computer vision. This paper presents a learning-based solution to this problem which goes beyond the traditional 3D cuboid representation by exploiting superquadrics as atomic elements. We demonstrate that superquadrics lead to more expressive 3D scene parses while being easier to learn than 3D cuboid representations. Moreover, we provide an analytical solution to the Chamfer loss which avoids the need for computationally expensive reinforcement learning or iterative prediction. Our model learns to parse 3D objects into consistent superquadric representations without supervision. Results on various ShapeNet categories as well as the SURREAL human body dataset demonstrate the flexibility of our model in capturing fine details and complex poses that could not have been modelled using cuboids.
Latex Bibtex Citation:
@INPROCEEDINGS{Paschalidou2019CVPR,
  author = {Despoina Paschalidou and Ali Osman Ulusoy and Andreas Geiger},
  title = {Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2019}
}
RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials (spotlight)
D. Paschalidou, A. Ulusoy, C. Schmitt, L. Van Gool and A. Geiger
Conference on Computer Vision and Pattern Recognition (CVPR), 2018
Abstract: In this paper, we consider the problem of reconstructing a dense 3D model using images captured from different views. Recent methods based on convolutional neural networks (CNNs) allow learning the entire task from data. However, they do not incorporate the physics of image formation such as perspective geometry and occlusion. In contrast, classical approaches based on Markov Random Fields (MRFs) with ray potentials explicitly model these physical processes, but they cannot cope with large surface appearance variations across different viewpoints. In this paper, we propose RayNet, which combines the strengths of both frameworks. RayNet integrates a CNN that learns view-invariant feature representations with an MRF that explicitly encodes the physics of perspective projection and occlusion. We train RayNet end-to-end using empirical risk minimization. We thoroughly evaluate our approach on challenging real-world datasets and demonstrate its benefits over a piece-wise trained baseline, hand-crafted models as well as other learning-based approaches.
Latex Bibtex Citation:
@INPROCEEDINGS{Paschalidou2018CVPR,
  author = {Despoina Paschalidou and Ali Osman Ulusoy and Carolin Schmitt and Luc {Van Gool} and Andreas Geiger},
  title = {RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2018}
}