{ "id": "2304.08486", "version": "v1", "published": "2023-04-17T17:59:26.000Z", "updated": "2023-04-17T17:59:26.000Z", "title": "BenchMD: A Benchmark for Modality-Agnostic Learning on Medical Images and Sensors", "authors": [ "Kathryn Wantlin", "Chenwei Wu", "Shih-Cheng Huang", "Oishi Banerjee", "Farah Dadabhoy", "Veeral Vipin Mehta", "Ryan Wonhee Han", "Fang Cao", "Raja R. Narayan", "Errol Colak", "Adewole Adamson", "Laura Heacock", "Geoffrey H. Tison", "Alex Tamkin", "Pranav Rajpurkar" ], "categories": [ "cs.CV" ], "abstract": "Medical data poses a daunting challenge for AI algorithms: it exists in many different modalities, experiences frequent distribution shifts, and suffers from a scarcity of examples and labels. Recent advances, including transformers and self-supervised learning, promise a more universal approach that can be applied flexibly across these diverse conditions. To measure and drive progress in this direction, we present BenchMD: a benchmark that tests how modality-agnostic methods, including architectures and training techniques (e.g. self-supervised learning, ImageNet pretraining), perform on a diverse array of clinically-relevant medical tasks. BenchMD combines 19 publicly available datasets for 7 medical modalities, including 1D sensor data, 2D images, and 3D volumetric scans. Our benchmark reflects real-world data constraints by evaluating methods across a range of dataset sizes, including challenging few-shot settings that incentivize the use of pretraining. Finally, we evaluate performance on out-of-distribution data collected at different hospitals than the training data, representing naturally-occurring distribution shifts that frequently degrade the performance of medical AI models. Our baseline results demonstrate that no modality-agnostic technique achieves strong performance across all modalities, leaving ample room for improvement on the benchmark. Code is released at https://github.com/rajpurkarlab/BenchMD .", "revisions": [ { "version": "v1", "updated": "2023-04-17T17:59:26.000Z" } ], "analyses": { "keywords": [ "medical images", "modality-agnostic learning", "benchmark reflects real-world data constraints", "modality-agnostic technique achieves strong performance", "experiences frequent distribution shifts" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }