{ "id": "2304.08486", "version": "v1", "published": "2023-04-17T17:59:26.000Z", "updated": "2023-04-17T17:59:26.000Z", "title": "BenchMD: A Benchmark for Modality-Agnostic Learning on Medical Images and Sensors", "authors": [ "Kathryn Wantlin", "Chenwei Wu", "Shih-Cheng Huang", "Oishi Banerjee", "Farah Dadabhoy", "Veeral Vipin Mehta", "Ryan Wonhee Han", "Fang Cao", "Raja R. Narayan", "Errol Colak", "Adewole Adamson", "Laura Heacock", "Geoffrey H. Tison", "Alex Tamkin", "Pranav Rajpurkar" ], "categories": [ "cs.CV" ], "abstract": "Medical data poses a daunting challenge for AI algorithms: it exists in many different modalities, experiences frequent distribution shifts, and suffers from a scarcity of examples and labels. Recent advances, including transformers and self-supervised learning, promise a more universal approach that can be applied flexibly across these diverse conditions. To measure and drive progress in this direction, we present BenchMD: a benchmark that tests how modality-agnostic methods, including architectures and training techniques (e.g. self-supervised learning, ImageNet pretraining), perform on a diverse array of clinically-relevant medical tasks. BenchMD combines 19 publicly available datasets for 7 medical modalities, including 1D sensor data, 2D images, and 3D volumetric scans. Our benchmark reflects real-world data constraints by evaluating methods across a range of dataset sizes, including challenging few-shot settings that incentivize the use of pretraining. Finally, we evaluate performance on out-of-distribution data collected at different hospitals than the training data, representing naturally-occurring distribution shifts that frequently degrade the performance of medical AI models. Our baseline results demonstrate that no modality-agnostic technique achieves strong performance across all modalities, leaving ample room for improvement on the benchmark. Code is released at https://github.com/rajpurkarlab/BenchMD .", "revisions": [ { "version": "v1", "updated": "2023-04-17T17:59:26.000Z" } ], "analyses": { "keywords": [ "medical images", "modality-agnostic learning", "benchmark reflects real-world data constraints", "modality-agnostic technique achieves strong performance", "experiences frequent distribution shifts" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }