arXiv Analytics

Sign in

arXiv:1804.04259 [cs.CV]AbstractReferencesReviewsResources

Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation

Zhaoyang Lv, Kihwan Kim, Alejandro Troccoli, Deqing Sun, James M. Rehg, Jan Kautz

Published 2018-04-12Version 1

Estimation of 3D motion in a dynamic scene from a temporal pair of images is a core task in many scene understanding problems. In real world applications, a dynamic scene is commonly captured by a moving camera (i.e., panning, tilting or hand-held), increasing the task complexity because the scene is observed from different view points. The main challenge is the disambiguation of the camera motion from scene motion, which becomes more difficult as the amount of rigidity observed decreases, even with successful estimation of 2D image correspondences. Compared to other state-of-the-art 3D scene flow estimation methods, in this paper we propose to \emph{learn} the rigidity of a scene in a supervised manner from a large collection of dynamic scene data, and directly infer a rigidity mask from two sequential images with depths. With the learned network, we show how we can effectively estimate camera motion and projected scene flow using computed 2D optical flow and the inferred rigidity mask. For training and testing the rigidity network, we also provide a new semi-synthetic dynamic scene dataset (synthetic foreground objects with a real background) and an evaluation split that accounts for the percentage of observed non-rigid pixels. Through our evaluation we show the proposed framework outperforms current state-of-the-art scene flow estimation methods in challenging dynamic scenes.

Related articles: Most relevant | Search more
arXiv:2001.05238 [cs.CV] (Published 2020-01-15)
Moving Objects Detection with a Moving Camera: A Comprehensive Review
arXiv:2412.14705 [cs.CV] (Published 2024-12-19)
Event-assisted 12-stop HDR Imaging of Dynamic Scene
arXiv:2404.05163 [cs.CV] (Published 2024-04-08)
Semantic Flow: Learning Semantic Field of Dynamic Scenes from Monocular Videos