arXiv Analytics

arXiv:2205.15173 [cs.CV]

Self-Supervised Pre-training of Vision Transformers for Dense Prediction Tasks

Jaonary Rabarisoa, Valentin Belissen, Florian Chabot, Quoc-Cuong Pham

Published 2022-05-30 (Version 1)

We present a new self-supervised pre-training method for Vision Transformers aimed at dense prediction tasks. It is based on a contrastive loss across views that compares pixel-level representations to global image representations. This strategy produces local features better suited to dense prediction tasks than contrastive pre-training based on global image representations alone. Furthermore, our approach does not suffer when the batch size is reduced, since the number of negative examples needed in the contrastive loss is on the order of the number of local features. We demonstrate the effectiveness of our pre-training strategy on two dense prediction tasks: semantic segmentation and monocular depth estimation.
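
To make the local-to-global contrastive idea concrete, here is a minimal sketch of one plausible instantiation of such a loss in PyTorch. It is not the authors' implementation: the function name, the `pos_mask` convention (patch tokens of view 1 that overlap view 2's crop are treated as positives, the remaining patches of the same image as negatives) and the temperature value are assumptions, chosen so that the number of negatives scales with the number of local features rather than the batch size, as the abstract claims.

```python
# Illustrative sketch of a local-to-global contrastive loss (assumed design,
# not the paper's released code): the global representation of view 2 is the
# anchor, and it is contrasted against the patch-level features of view 1
# from the same image, so negatives are local features rather than other
# images in the batch.
import torch
import torch.nn.functional as F


def local_to_global_nce(patch_tokens, global_token, pos_mask, temperature=0.1):
    """
    patch_tokens: (B, N, D) pixel/patch-level features from view 1
    global_token: (B, D)    global image representation from view 2
    pos_mask:     (B, N)    True where a patch counts as a positive match
    """
    patches = F.normalize(patch_tokens, dim=-1)   # (B, N, D)
    anchors = F.normalize(global_token, dim=-1)   # (B, D)

    # Cosine similarity between each image's global anchor and its own patches.
    logits = torch.einsum("bd,bnd->bn", anchors, patches) / temperature  # (B, N)

    # Multi-positive InfoNCE: maximise the probability mass assigned to the
    # positive patches relative to all patches of the same image.
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)     # (B, N)
    pos_log_prob = (log_prob * pos_mask).sum(dim=1) / pos_mask.sum(dim=1).clamp(min=1)
    return -pos_log_prob.mean()
```

Because the softmax in this sketch runs over the N patch tokens of a single image, the pool of negatives is fixed by the number of local features, which is consistent with the abstract's claim that the method tolerates small batch sizes.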

Related articles:
arXiv:2107.04735 [cs.CV] (Published 2021-07-10)
Local-to-Global Self-Attention in Vision Transformers
arXiv:2106.15788 [cs.CV] (Published 2021-06-30)
Align Yourself: Self-supervised Pre-training for Fine-grained Recognition via Saliency Alignment
Di Wu et al.
arXiv:2203.11894 [cs.CV] (Published 2022-03-22)
GradViT: Gradient Inversion of Vision Transformers