arXiv Analytics


arXiv:2004.10605 [cs.CV]

Self-Supervised Representation Learning on Document Images

Adrian Cosma, Mihai Ghidoveanu, Michael Panaitescu-Liess, Marius Popescu

Published 2020-04-18 (Version 1)

This work analyses the impact of self-supervised pre-training on document images. While previous approaches explore the effect of self-supervision on natural images, we show that patch-based pre-training performs poorly on text document images because of their different structural properties and poor intra-sample semantic information. We propose two context-aware alternatives to improve performance. We also propose a novel method for self-supervision that makes use of the inherent multi-modality of documents (image and text) and performs better than other popular self-supervised methods as well as supervised ImageNet pre-training.
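The abstract does not spell out which patch-based pretext task is meant, so as a point of reference, below is a minimal sketch of one common patch-based objective: predicting the relative position of a neighbouring patch with respect to a centre patch. All model names, patch sizes, and the toy data are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch (PyTorch) of a relative-patch-position pretext task,
# a common patch-based self-supervised objective. Shapes and names are
# illustrative assumptions; the paper's exact configuration may differ.
import torch
import torch.nn as nn

class PatchPositionNet(nn.Module):
    """Predict the position (0-7) of a neighbour patch relative to a centre patch."""
    def __init__(self, embed_dim=128):
        super().__init__()
        # Small shared CNN encoder applied to each 64x64 grayscale patch.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim), nn.ReLU(),
        )
        # Classifier over the 8 possible neighbour positions.
        self.head = nn.Linear(2 * embed_dim, 8)

    def forward(self, centre, neighbour):
        z = torch.cat([self.encoder(centre), self.encoder(neighbour)], dim=1)
        return self.head(z)

# Toy training step on random tensors standing in for document patches.
model = PatchPositionNet()
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
centre = torch.rand(16, 1, 64, 64)      # batch of centre patches
neighbour = torch.rand(16, 1, 64, 64)   # corresponding neighbour patches
labels = torch.randint(0, 8, (16,))     # relative position index (0-7)

logits = model(centre, neighbour)
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()
optimiser.step()
```

On document images such patches often contain little more than uniform text texture, which is one plausible reading of why the abstract reports that purely patch-based objectives underperform compared to context-aware or multi-modal alternatives.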

Comments: 15 pages, 5 figures. Accepted at DAS 2020: IAPR International Workshop on Document Analysis Systems
Categories: cs.CV, cs.LG, eess.IV, stat.ML
Subjects: 68T05
Related articles:
arXiv:2003.00105 [cs.CV] (Published 2020-02-28)
Self-supervised Representation Learning for Ultrasound Video
arXiv:2101.06553 [cs.CV] (Published 2021-01-16)
Self-Supervised Representation Learning from Flow Equivariance
arXiv:2009.07994 [cs.CV] (Published 2020-09-17)
AAG: Self-Supervised Representation Learning by Auxiliary Augmentation with GNT-Xent Loss