{ "id": "2403.15705", "version": "v1", "published": "2024-03-23T03:56:25.000Z", "updated": "2024-03-23T03:56:25.000Z", "title": "UPNeRF: A Unified Framework for Monocular 3D Object Reconstruction and Pose Estimation", "authors": [ "Yuliang Guo", "Abhinav Kumar", "Cheng Zhao", "Ruoyu Wang", "Xinyu Huang", "Liu Ren" ], "categories": [ "cs.CV" ], "abstract": "Monocular 3D reconstruction for categorical objects heavily relies on accurately perceiving each object's pose. While gradient-based optimization within a NeRF framework updates initially given poses, this paper highlights that such a scheme fails when the initial pose even moderately deviates from the true pose. Consequently, existing methods often depend on a third-party 3D object detector to provide an initial object pose, leading to increased complexity and generalization issues. To address these challenges, we present UPNeRF, a Unified framework integrating Pose estimation and NeRF-based reconstruction, bringing us closer to real-time monocular 3D object reconstruction. UPNeRF decouples the object's dimension estimation and pose refinement to resolve the scale-depth ambiguity, and introduces an effective projected-box representation that generalizes well across different domains. While using a dedicated pose estimator that smoothly integrates into an object-centric NeRF, UPNeRF is free from external 3D detectors. UPNeRF achieves state-of-the-art results in both reconstruction and pose estimation tasks on the nuScenes dataset. 
Furthermore, UPNeRF exhibits exceptional cross-dataset generalization on the KITTI and Waymo datasets, surpassing prior methods with up to a 50% reduction in rotation and translation error.", "revisions": [ { "version": "v1", "updated": "2024-03-23T03:56:25.000Z" } ], "analyses": { "keywords": [ "unified framework", "framework integrating pose estimation", "real-time monocular 3d object reconstruction", "upnerf achieves state-of-the-art results", "external 3d detectors" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }