arXiv:1711.01732 [cs.CV]

Active Learning for Visual Question Answering: An Empirical Study

Xiao Lin, Devi Parikh

Published 2017-11-06 (Version 1)

We present an empirical study of active learning for Visual Question Answering, where a deep VQA model selects informative question-image pairs from a pool and queries an oracle for answers to maximally improve its performance under a limited query budget. Drawing analogies from human learning, we explore cramming (entropy), curiosity-driven (expected model change), and goal-driven (expected error reduction) active learning approaches, and propose a fast and effective goal-driven active learning scoring function to pick question-image pairs for deep VQA models under the Bayesian Neural Network framework. We find that deep VQA models need large amounts of training data before they can start asking informative questions. But once they do, all three approaches outperform the random selection baseline and achieve significant query savings. For the scenario where the model is allowed to ask generic questions about images but is evaluated only on specific questions (e.g., questions whose answer is either yes or no), our proposed goal-driven scoring function performs the best.
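The first of the three criteria, the entropy ("cramming") score, can be sketched in a few lines: score each pool item by the predictive entropy of the model's answer distribution and query the highest-scoring pairs. This is a minimal illustration, not the paper's implementation; the function names are invented here, and the paper's actual scoring operates on deep VQA models under a Bayesian Neural Network framework.

```python
import numpy as np

def entropy_scores(probs):
    """Predictive entropy per pool item; higher = more uncertain = more informative.

    probs: (N, K) array of answer probabilities for N question-image pairs.
    """
    eps = 1e-12  # avoid log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)

def select_queries(probs, budget):
    """Pick the `budget` pool items with the highest entropy to send to the oracle."""
    scores = entropy_scores(probs)
    return np.argsort(-scores)[:budget]

# Example: a near-uniform answer distribution outranks a confident one.
pool = np.array([
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain pair
    [0.97, 0.01, 0.01, 0.01],  # model is already confident
])
picked = select_queries(pool, budget=1)
# picked[0] == 0: the uncertain pair is queried first
```

The curiosity-driven (expected model change) and goal-driven (expected error reduction) criteria replace `entropy_scores` with scores that additionally account for how an oracle answer would update the model, which is where the paper's fast goal-driven scoring function comes in.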

Related articles:
arXiv:1907.12133 [cs.CV] (Published 2019-07-28)
An Empirical Study on Leveraging Scene Graphs for Visual Question Answering
arXiv:2007.06364 [cs.CV] (Published 2020-07-13)
On uncertainty estimation in active learning for image segmentation
arXiv:2002.03752 [cs.CV] (Published 2020-01-25)
An Empirical Study of Person Re-Identification with Attributes