
arXiv:2003.08429 [cs.CV]

STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos

Ali Athar, Sabarinath Mahadevan, Aljoša Ošep, Laura Leal-Taixé, Bastian Leibe

Published 2020-03-18, Version 1

Existing methods for instance segmentation in videos typically involve multi-stage pipelines that follow the tracking-by-detection paradigm and model a video clip as a sequence of images. Multiple networks are used to detect objects in individual frames, and then associate these detections over time. Hence, these methods are often non-end-to-end trainable and highly tailored to specific tasks. In this paper, we propose a different approach that is well-suited to a variety of tasks involving instance segmentation in videos. In particular, we model a video clip as a single 3D spatio-temporal volume, and propose a novel approach that segments and tracks instances across space and time in a single stage. Our problem formulation is centered around the idea of spatio-temporal embeddings which are trained to cluster pixels belonging to a specific object instance over an entire video clip. To this end, we introduce (i) novel mixing functions that enhance the feature representation of spatio-temporal embeddings, and (ii) a single-stage, proposal-free network that can reason about temporal context. Our network is trained end-to-end to learn spatio-temporal embeddings as well as parameters required to cluster these embeddings, thus simplifying inference. Our method achieves state-of-the-art results across multiple datasets and tasks.
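The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian scoring function, the names `cluster_probability` and `segment_clip`, the 0.5 threshold, and the toy values are all assumptions made for illustration. In the paper the embeddings and the clustering parameters are learned by the network, whereas here a cluster center and variance are simply given. The sketch only shows how pixels from every frame of a clip can be assigned to one instance jointly, so that segmentation and tracking happen in a single step.

```python
import math


def cluster_probability(embedding, center, variance):
    """Gaussian-style score in (0, 1] that an embedding belongs to a cluster.

    Assumption: a diagonal Gaussian around the cluster center; the abstract
    does not specify the form of the clustering function.
    """
    squared = sum((e - c) ** 2 / v for e, c, v in zip(embedding, center, variance))
    return math.exp(-0.5 * squared)


def segment_clip(embeddings, center, variance, threshold=0.5):
    """Return a T x H x W boolean mask for one instance over the whole clip.

    embeddings[t][y][x] is the embedding vector of one pixel. All frames are
    scored against the same cluster, so the instance keeps a single identity
    across time without a separate association step.
    """
    return [
        [[cluster_probability(pixel, center, variance) > threshold for pixel in row]
         for row in frame]
        for frame in embeddings
    ]


# Toy clip (hypothetical values): 2 frames, 1 x 3 pixels, 2-D embeddings.
clip = [
    [[(0.0, 0.0), (0.1, 0.0), (2.0, 2.0)]],
    [[(2.0, 2.1), (0.0, 0.1), (0.1, 0.1)]],
]
mask = segment_clip(clip, center=(0.0, 0.0), variance=(0.25, 0.25))
```

In this toy clip the pixels whose embeddings lie near the center are selected in both frames, even though they occupy different positions in each frame.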

Comments: 28 pages, 6 figures

Categories: cs.CV, cs.LG, eess.IV

Subjects: 68T45, 68T10, 62H30, I.4.6, I.4.8, I.5.3

Keywords: instance segmentation, single 3D spatio-temporal volume, method achieves state-of-the-art results, learn spatio-temporal embeddings, entire video clip

Related articles:

arXiv:1904.13273 [cs.CV] (Published 2019-04-30)

Detecting Reflections by Combining Semantic and Instance Segmentation

David Owen, Ping-Lin Chang

arXiv:1802.07465 [cs.CV] (Published 2018-02-21)

Multiclass Weighted Loss for Instance Segmentation of Cluttered Cells

Fidel A. Guerrero-Pena, Pedro D. Marrero Fernandez, Tsang Ing Ren, Mary Yui, Ellen Rothenberg, Alexandre Cunha

arXiv:2004.13665 [cs.CV] (Published 2020-04-28)