
arXiv:2207.10662 [cs.CV]

Generalizable Patch-Based Neural Rendering

Mohammed Suhail, Carlos Esteves, Leonid Sigal, Ameesh Makadia

Published 2022-07-21 (Version 1)

Neural rendering has received tremendous attention since the advent of Neural Radiance Fields (NeRF), and has pushed the state-of-the-art on novel-view synthesis considerably. The recent focus has been on models that overfit to a single scene, and the few attempts to learn models that can synthesize novel views of unseen scenes mostly consist of combining deep convolutional features with a NeRF-like model. We propose a different paradigm, where no deep features and no NeRF-like volume rendering are needed. Our method is capable of predicting the color of a target ray in a novel scene directly, just from a collection of patches sampled from the scene. We first leverage epipolar geometry to extract patches along the epipolar lines of each reference view. Each patch is linearly projected into a 1D feature vector, and a sequence of transformers processes the collection. For positional encoding, we parameterize rays as in a light field representation, with the crucial difference that the coordinates are canonicalized with respect to the target ray, which makes our method independent of the reference frame and improves generalization. We show that our approach outperforms the state-of-the-art on novel-view synthesis of unseen scenes even when trained with considerably less data than prior work.
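
To make the canonicalization idea concrete, here is a minimal Python/NumPy sketch of one way to express reference rays in a coordinate frame built from the target ray. The function name, the origin-plus-direction ray representation, and the basis construction are illustrative assumptions, not the authors' exact light-field parameterization:

    import numpy as np

    def canonicalize_rays(target_origin, target_dir, ray_origins, ray_dirs):
        # Hypothetical helper: re-express reference rays relative to the
        # target ray so coordinates no longer depend on the world frame.
        z = target_dir / np.linalg.norm(target_dir)
        # Pick an 'up' vector not parallel to z to complete an orthonormal basis.
        up = np.array([0.0, 1.0, 0.0])
        if abs(np.dot(up, z)) > 0.9:
            up = np.array([1.0, 0.0, 0.0])
        x = np.cross(up, z)
        x /= np.linalg.norm(x)
        y = np.cross(z, x)
        R = np.stack([x, y, z])                    # world -> target-ray frame
        o = (ray_origins - target_origin) @ R.T    # translate, then rotate origins
        d = ray_dirs @ R.T                         # rotate directions
        return o, d                                # canonicalized ray coordinates

In this sketch, the target ray maps to the origin with direction (0, 0, 1), so any two scenes that differ only by a rigid world-frame transform yield identical canonicalized inputs, which is the property the abstract credits for improved generalization.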

Comments: Project Page with code and results at https://mohammedsuhail.net/gen_patch_neural_rendering/
Categories: cs.CV
Related articles:
arXiv:2303.11963 [cs.CV] (Published 2023-03-21)
NEMTO: Neural Environment Matting for Novel View and Relighting Synthesis of Transparent Objects
arXiv:1412.0003 [cs.CV] (Published 2014-11-26)
3D-Assisted Image Feature Synthesis for Novel Views of an Object
arXiv:2407.21450 [cs.CV] (Published 2024-07-31)
Forecasting Future Videos from Novel Views via Disentangled 3D Scene Representation