arXiv:2311.18531 [cs.CV]

Dataset Distillation via the Wasserstein Metric

Haoyang Liu, Tiancheng Xing, Luwei Li, Vibhu Dalal, Jingrui He, Haohan Wang

Published 2023-11-30 (Version 1)

Dataset distillation (DD) is a compelling approach in computer vision that aims to condense large datasets into much smaller synthetic versions without sacrificing much model performance. In this paper, we continue the study of DD methods by addressing the field's core objective: how to capture the essential representation of an extensive dataset in a smaller, synthetic form. We propose a novel approach that uses the Wasserstein distance, a metric rooted in optimal transport theory, to enhance distribution matching in DD. Our method leverages the Wasserstein barycenter, which offers a geometrically meaningful way to quantify differences between distributions and to capture the centroid of a set of distributions. Our approach retains the computational benefits of distribution-matching methods while achieving new state-of-the-art performance on several benchmarks. To provide a useful prior for learning the synthetic images, we embed the synthetic data in the feature space of pretrained classification models and conduct distribution matching there. Extensive experiments on various high-resolution datasets confirm the effectiveness and adaptability of our method, indicating the promising yet largely unexplored capabilities of Wasserstein metrics in dataset distillation.
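The abstract only names the key ingredient: a Wasserstein barycenter used as the distribution-matching target in a pretrained model's feature space. As a concrete illustration, below is a minimal, self-contained sketch of one standard way to compute such a barycenter, the fixed-support entropic scheme of Benamou et al. (2015) via iterative Bregman projections. This is not the authors' code: the function name `sinkhorn_barycenter`, the toy support, and all parameter values are illustrative assumptions, and the paper's actual procedure (e.g., free-support vs. fixed-support) is not specified in the abstract.

```python
# A minimal sketch (assumptions, not the paper's code): fixed-support
# entropic Wasserstein barycenter via iterative Bregman projections
# (Benamou et al., 2015).
import numpy as np

def sinkhorn_barycenter(hists, M, reg=0.05, weights=None, n_iter=200):
    """Entropic Wasserstein barycenter of histograms on a shared support.

    hists   : (k, n) array, k histograms over n support points (rows sum to 1)
    M       : (n, n) ground-cost matrix between support points
    reg     : entropic regularization strength
    weights : (k,) barycentric weights; uniform if None
    """
    k, n = hists.shape
    if weights is None:
        weights = np.full(k, 1.0 / k)
    K = np.exp(-M / reg)                      # Gibbs kernel
    u = np.ones((k, n))                       # one scaling vector per histogram
    for _ in range(n_iter):
        v = hists / (u @ K)                   # row i: a_i / (K^T u_i)
        Kv = v @ K.T                          # row i: K v_i
        b = np.exp(weights @ np.log(u * Kv))  # geometric mean of the marginals
        u = b / Kv                            # re-scale toward the barycenter
    return b

# Toy usage. In the setting the abstract describes, the support would be
# feature embeddings of real images from a pretrained classifier (the random
# `support` here is a stand-in), and the barycenter would serve as the
# distribution-matching target for the synthetic images' features.
rng = np.random.default_rng(0)
support = rng.normal(size=(64, 16))                        # 64 support points, 16-d
M = ((support[:, None] - support[None, :]) ** 2).sum(-1)   # squared-Euclidean cost
M /= M.max()
hists = rng.dirichlet(np.ones(64), size=2)                 # two empirical histograms
b = sinkhorn_barycenter(hists, M)
print(b.shape, round(b.sum(), 4))                          # (64,) ≈ 1.0
```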

Related articles:
arXiv:2203.11932 [cs.CV] (Published 2022-03-22)
Dataset Distillation by Matching Training Trajectories
arXiv:2210.16774 [cs.CV] (Published 2022-10-30)
Dataset Distillation via Factorization
arXiv:2007.13010 [cs.CV] (Published 2020-07-25)
Style is a Distribution of Features