arXiv:2010.06000 [cs.CV]

MedICaT: A Dataset of Medical Images, Captions, and Textual References

Sanjay Subramanian, Lucy Lu Wang, Sachin Mehta, Ben Bogin, Madeleine van Zuylen, Sravanthi Parasa, Sameer Singh, Matt Gardner, Hannaneh Hajishirzi

Published 2020-10-12 (Version 1)

Understanding the relationship between figures and text is key to scientific document understanding. Medical figures in particular are quite complex, often consisting of several subfigures (75% of figures in our dataset), with detailed text describing their content. Previous work studying figures in scientific papers focused on classifying figure content rather than understanding how images relate to the text. To address challenges in figure retrieval and figure-to-text alignment, we introduce MedICaT, a dataset of medical images in context. MedICaT consists of 217K images from 131K open access biomedical papers, and includes captions, inline references for 74% of figures, and manually annotated subfigures and subcaptions for a subset of figures. Using MedICaT, we introduce the task of subfigure to subcaption alignment in compound figures and demonstrate the utility of inline references in image-text matching. Our data and code can be accessed at https://github.com/allenai/medicat.
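As a rough illustration of how the released figure-caption records might be consumed, here is a minimal sketch in Python. It assumes the annotations are distributed as JSON Lines with one figure per record; the field names used below (pdf_hash, fig_uri, fig_caption, inline_refs) are illustrative placeholders, not the confirmed release schema — consult the linked repository for the actual format.

```python
import json
from pathlib import Path


def load_figure_records(jsonl_path):
    """Yield one dict per figure from a JSON Lines annotation file."""
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def figures_with_inline_refs(records):
    """Keep only figures that carry inline references from the paper body
    (the paper reports such references for ~74% of figures).
    The 'inline_refs' key is a hypothetical field name."""
    return [r for r in records if r.get("inline_refs")]


if __name__ == "__main__":
    records = list(load_figure_records(Path("medicat_annotations.jsonl")))
    with_refs = figures_with_inline_refs(records)
    print(f"{len(with_refs)}/{len(records)} figures have inline references")
    for r in with_refs[:3]:
        print(r.get("pdf_hash"), "-", (r.get("fig_caption") or "")[:80])
```

A filter like figures_with_inline_refs is the natural starting point for the image-text matching experiments the abstract describes, since those rely on figures that are actually discussed in the body text.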

Related articles:
arXiv:2403.12570 [cs.CV] (Published 2024-03-19)
Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images
arXiv:2307.08919 [cs.CV] (Published 2023-07-18)
Accuracy versus time frontiers of semi-supervised and self-supervised learning on medical images
arXiv:1507.01251 [cs.CV] (Published 2015-07-05)
Autoencoding the Retrieval Relevance of Medical Images