
arXiv:2406.08479 [cs.CV]

Real3D: Scaling Up Large Reconstruction Models with Real-World Images

Hanwen Jiang, Qixing Huang, Georgios Pavlakos

Published 2024-06-12, Version 1

The default strategy for training single-view Large Reconstruction Models (LRMs) follows the fully supervised route using large-scale datasets of synthetic 3D assets or multi-view captures. Although these resources simplify the training procedure, they are hard to scale up beyond the existing datasets and they are not necessarily representative of the real distribution of object shapes. To address these limitations, in this paper, we introduce Real3D, the first LRM system that can be trained using single-view real-world images. Real3D introduces a novel self-training framework that can benefit from both the existing synthetic data and diverse single-view real images. We propose two unsupervised losses that allow us to supervise LRMs at the pixel- and semantic-level, even for training examples without ground-truth 3D or novel views. To further improve performance and scale up the image data, we develop an automatic data curation approach to collect high-quality examples from in-the-wild images. Our experiments show that Real3D consistently outperforms prior work in four diverse evaluation settings that include real and synthetic data, as well as both in-domain and out-of-domain shapes. Code and model can be found here: https://hwjiang1510.github.io/Real3D/

Comments: Project page: https://hwjiang1510.github.io/Real3D/

Categories: cs.CV

Keywords: real-world images, diverse single-view real images, training single-view large reconstruction models, automatic data curation approach, real3d consistently outperforms prior work

