arXiv:1811.10770 [cs.CV]

Generating Attention from Classifier Activations for Fine-grained Recognition

Wei Shen, Rujie Liu

Published 2018-11-27, Version 1

Recent advances in fine-grained recognition utilize attention maps to localize objects of interest. Although there are many ways to generate attention maps, most rely on sophisticated loss functions or complex training procedures. In this work, we propose a simple and straightforward attention generation model based on the output activations of classifiers. The advantage of our model is that it can be trained easily with image-level labels and a softmax loss alone. More specifically, multiple linear local classifiers are first adopted to perform fine-grained classification at each location of the high-level CNN feature maps. The attention map is generated by aggregating and max-pooling their output activations. This attention map then serves as a surrogate target object mask for training those local classifiers, analogous to training models for semantic segmentation. Our model achieves state-of-the-art results on three widely benchmarked datasets, i.e., 87.9% on CUB-200-2011, 94.1% on Stanford Cars and 92.1% on FGVC-Aircraft, demonstrating its effectiveness on fine-grained recognition tasks.
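To make the core idea concrete, below is a minimal PyTorch-style sketch of attention generation from classifier activations as the abstract describes it: a 1x1 convolution plays the role of a bank of linear local classifiers applied independently at every spatial location of the feature map, and the attention map is obtained by max-pooling the resulting class activations over the class dimension. All module and variable names (LocalClassifierAttention, local_classifiers, etc.) are hypothetical illustrations, not the authors' implementation, and the actual architecture and training details in the paper may differ.

```python
import torch
import torch.nn as nn

class LocalClassifierAttention(nn.Module):
    """Sketch: linear local classifiers at each feature-map location,
    with an attention map from max-pooling the class activations."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        # A 1x1 convolution is equivalent to applying the same linear
        # classifier independently at every spatial location.
        self.local_classifiers = nn.Conv2d(in_channels, num_classes,
                                           kernel_size=1)

    def forward(self, features: torch.Tensor):
        # features: (B, C, H, W) high-level CNN feature maps
        logits = self.local_classifiers(features)       # (B, K, H, W)
        # Attention map: max over the class dimension at each location.
        attention, _ = logits.max(dim=1, keepdim=True)  # (B, 1, H, W)
        return logits, attention

# Usage sketch (hypothetical sizes): the per-location logits can be
# trained with a softmax loss against the image-level label, and the
# attention map can serve as a surrogate object mask, per the abstract.
model = LocalClassifierAttention(in_channels=2048, num_classes=200)
feats = torch.randn(2, 2048, 14, 14)
logits, attn = model(feats)
```

The appeal of this construction, as the abstract notes, is that it needs no extra loss terms or training stages: the same classification signal that drives recognition also produces the attention map.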

Related articles:
arXiv:1703.09983 [cs.CV] (Published 2017-03-29)
Iterative Object and Part Transfer for Fine-Grained Recognition
arXiv:1909.08950 [cs.CV] (Published 2019-09-19)
Count, Crop and Recognise: Fine-Grained Recognition in the Wild
arXiv:2203.14215 [cs.CV] (Published 2022-03-27)
Knowledge Mining with Scene Text for Fine-Grained Recognition
Hao Wang et al.