arXiv:2103.06638 [cs.CV]

Generalized Contrastive Optimization of Siamese Networks for Place Recognition

María Leyva-Vallina, Nicola Strisciuglio, Nicolai Petkov

Published 2021-03-11, Version 1

Visual place recognition is a challenging task in computer vision and a key component of camera-based localization and navigation systems. Recently, Convolutional Neural Networks (CNNs) have achieved strong results and good generalization capabilities. They are usually trained using pairs or triplets of images labeled as either similar or dissimilar, in a binary fashion. In practice, however, the similarity between two images is not binary but continuous. Furthermore, training these CNNs is computationally expensive and involves costly pair and triplet mining strategies. We propose a Generalized Contrastive loss (GCL) function that relies on image similarity as a continuous measure, and use it to train a siamese CNN. We also propose three techniques for automatic annotation of image pairs with labels indicating their degree of similarity, and deploy them to re-annotate the MSLS, TB-Places, and 7Scenes datasets. We demonstrate that siamese CNNs trained using the GCL function and the improved annotations consistently outperform their binary counterparts. Our models trained on MSLS outperform state-of-the-art methods, including NetVLAD, and generalize well to the Pittsburgh, TokyoTM and Tokyo 24/7 datasets. Moreover, training a siamese network using the GCL function does not require complex pair mining. We release the source code at https://github.com/marialeyvallina/generalized_contrastive_loss.
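
To make the idea of a contrastive loss driven by a continuous similarity label concrete, the following is a minimal sketch and not the authors' implementation. It assumes the graded similarity lies in [0, 1] and simply interpolates between the attracting and repelling terms of the classic binary contrastive loss; the exact formulation in the paper may differ. The function names, the Euclidean distance, and the default margin are illustrative assumptions.

```python
import math


def euclidean_distance(emb_a, emb_b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(emb_a, emb_b)))


def generalized_contrastive_loss(emb_a, emb_b, similarity, margin=0.5):
    """Contrastive loss weighted by a graded similarity in [0, 1].

    Assumption: similarity = 1 and similarity = 0 reduce to the usual
    binary contrastive terms for similar and dissimilar pairs.
    The margin default is a hypothetical value, not taken from the paper.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    d = euclidean_distance(emb_a, emb_b)
    attract = 0.5 * d ** 2                     # pulls similar pairs together
    repel = 0.5 * max(margin - d, 0.0) ** 2    # pushes dissimilar pairs apart
    return similarity * attract + (1.0 - similarity) * repel


if __name__ == "__main__":
    a, b = [0.0, 0.0], [0.3, 0.4]
    for s in (0.0, 0.5, 1.0):
        print(s, generalized_contrastive_loss(a, b, s, margin=0.7))
```

With a graded label, a partially overlapping pair contributes a blend of both terms instead of being forced into one of two classes, which is the intuition the abstract gives for why a continuous measure can help.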

Related articles:
arXiv:2503.09749 [cs.CV] (Published 2025-03-12, updated 2025-06-25)
A Siamese Network to Detect If Two Iris Images Are Monozygotic
arXiv:1905.00401 [cs.CV] (Published 2019-05-01)
Learn Stereo, Infer Mono: Siamese Networks for Self-Supervised, Monocular, Depth Estimation
arXiv:1911.04237 [cs.CV] (Published 2019-11-11)
PoshakNet: Framework for matching dresses from real-life photos using GAN and Siamese Network