Abstract
The core of self-supervised point cloud learning lies in setting up appropriate pretext tasks to construct a pre-training framework that enables the encoder to perceive 3D objects effectively. In this article, we integrate two prevalent methods, masked point modeling (MPM) and 3D-to-2D generation, as pretext tasks within a single pre-training framework. The two are complementary: the precise supervision offered by 3D-to-2D generation compensates for the ambiguous supervision signals of MPM, while the spatial awareness offered by MPM compensates for the generation task's insensitivity to geometric information. Specifically, the proposed framework, abbreviated as PointCG, consists of a Hidden Point Completion (HPC) module and an Arbitrary-view Image Generation (AIG) module. We first obtain the points visible from an arbitrary view by removing hidden points and use them as input. HPC then extracts a representation of the visible points with an encoder and completes the entire shape with a decoder, while AIG generates rendered images from the visible points' representations. Extensive experiments demonstrate the superiority of the proposed method over the baselines in various downstream tasks. Our code will be made available upon acceptance.
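The hidden-point-removal step mentioned in the abstract can be illustrated with the spherical-flipping visibility operator of Katz et al. (2007), a standard technique for deciding which points of a cloud are visible from a given viewpoint. The sketch below is a minimal NumPy/SciPy implementation, not the paper's exact procedure; the function name, the `gamma` radius parameter, and the random test cloud are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import ConvexHull

def hidden_point_removal(points, camera, gamma=3.0):
    """Visibility test via spherical flipping (Katz et al., 2007).

    points: (N, 3) array of xyz coordinates.
    camera: (3,) viewpoint position.
    gamma:  controls the flipping-sphere radius (hypothetical default;
            PointCG may choose it differently).
    Returns indices of points judged visible from the viewpoint.
    """
    p = points - camera                               # place viewpoint at origin
    norms = np.linalg.norm(p, axis=1, keepdims=True)  # (N, 1) distances
    radius = norms.max() * (10.0 ** gamma)            # flipping-sphere radius
    flipped = p + 2.0 * (radius - norms) * p / norms  # reflect across the sphere
    # A point is visible iff its flipped image lies on the convex hull
    # of the flipped cloud augmented with the viewpoint (the origin).
    hull = ConvexHull(np.vstack([flipped, np.zeros((1, 3))]))
    return hull.vertices[hull.vertices < len(points)]

# Toy usage: keep the points visible from a camera on the +z axis.
pts = np.random.rand(2048, 3) - 0.5
visible = pts[hidden_point_removal(pts, camera=np.array([0.0, 0.0, 3.0]))]
```

The joint objective of the two branches can likewise be sketched as a single training step. Everything below is an assumption for illustration: `encoder`, `completion_decoder`, and `image_generator` stand in for the paper's backbone, HPC decoder, and AIG head, whose architectures are not specified here; Chamfer distance for completion, MSE for image generation, and equal loss weighting are plausible but assumed choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets (B, N, 3) and (B, M, 3)."""
    d = torch.cdist(a, b)  # (B, N, M) pairwise distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()

class TwoBranchPretrainer(nn.Module):
    """Hypothetical wrapper around the two pretext tasks: HPC completes the
    full shape from the visible points, while AIG generates a rendered image
    from the same shared representation."""

    def __init__(self, encoder, completion_decoder, image_generator):
        super().__init__()
        self.encoder = encoder                        # shared point backbone
        self.completion_decoder = completion_decoder  # HPC branch
        self.image_generator = image_generator        # AIG branch

    def forward(self, visible_pts, full_pts, target_img):
        feats = self.encoder(visible_pts)             # representation of visible points
        loss_hpc = chamfer_distance(self.completion_decoder(feats), full_pts)
        loss_aig = F.mse_loss(self.image_generator(feats), target_img)
        return loss_hpc + loss_aig                    # equal weighting assumed
```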
| Original language | English |
|---|---|
| Pages (from-to) | 6648-6660 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Visualization and Computer Graphics |
| Volume | 31 |
| Issue number | 10 |
| DOIs | |
| Publication status | Published - 2025 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository as part of the Taverne amendment. More information about this copyright law amendment can be found at https://www.openaccess.nl. Otherwise, as indicated in the copyright section: the publisher is the copyright holder of this work, and the author uses the Dutch legislation to make this work public.

Keywords
- arbitrary-view image generation
- hidden point completion
- point clouds
- PointCG
- self-supervised learning