{ "id": "2203.14215", "version": "v1", "published": "2022-03-27T05:54:00.000Z", "updated": "2022-03-27T05:54:00.000Z", "title": "Knowledge Mining with Scene Text for Fine-Grained Recognition", "authors": [ "Hao Wang", "Junchao Liao", "Tianheng Cheng", "Zewen Gao", "Hao Liu", "Bo Ren", "Xiang Bai", "Wenyu Liu" ], "comment": "Accepted to CVPR 2022. The source code and new dataset of this work are available at https://github.com/lanfeng4659/KnowledgeMiningWithSceneText", "categories": [ "cs.CV" ], "abstract": "Recently, the semantics of scene text has been proven to be essential in fine-grained image classification. However, the existing methods mainly exploit the literal meaning of scene text for fine-grained recognition, which might be irrelevant when it is not significantly related to objects/scenes. We propose an end-to-end trainable network that mines implicit contextual knowledge behind scene text images and enhances the semantics and correlation to fine-tune the image representation. Unlike the existing methods, our model integrates three modalities: visual feature extraction, text semantics extraction, and correlation of background knowledge, for fine-grained image classification. Specifically, we employ KnowBert to retrieve relevant knowledge for semantic representation and combine it with image features for fine-grained classification. Experiments on two benchmark datasets, Con-Text and Drink Bottle, show that our method outperforms the state of the art by 3.72\\% mAP and 5.39\\% mAP, respectively. To further validate the effectiveness of the proposed method, we create a new dataset on crowd activity recognition for evaluation. The source code and new dataset of this work are available at https://github.com/lanfeng4659/KnowledgeMiningWithSceneText.", "revisions": [ { "version": "v1", "updated": "2022-03-27T05:54:00.000Z" } ], "analyses": { "keywords": [ "fine-grained recognition", "knowledge mining", "fine-grained image classification", "mines implicit contextual knowledge", "text semantics extraction" ], "tags": [ "github project" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }