arXiv:2309.07110 [stat.ML]

Data Augmentation via Subgroup Mixup for Improving Fairness

Madeline Navarro, Camille Little, Genevera I. Allen, Santiago Segarra

Published 2023-09-13, Version 1

In this work, we propose data augmentation via pairwise mixup across subgroups to improve group fairness. Many real-world applications of machine learning systems exhibit biases across certain groups due to under-representation or training data that reflects societal biases. Inspired by the successes of mixup for improving classification performance, we develop a pairwise mixup scheme to augment training data and encourage fair and accurate decision boundaries for all subgroups. Data augmentation for group fairness allows us to add new samples of underrepresented groups to balance subpopulations. Furthermore, our method allows us to use the generalization ability of mixup to improve both fairness and accuracy. We compare our proposed mixup to existing data augmentation and bias mitigation approaches on both synthetic simulations and real-world benchmark fair classification data, demonstrating that we are able to achieve fair outcomes with robust if not improved accuracy.
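The abstract describes the pairwise mixup only at a high level, so the following is a minimal sketch under stated assumptions rather than the authors' exact procedure. It assumes that a subgroup is a (group attribute, label) pair, that each synthetic sample is a convex combination of one sample from each of two distinct subgroups, that the mixing weight is drawn from a Beta(alpha, alpha) distribution as in standard mixup, and that subgroup pairs are drawn uniformly so that small subgroups appear as often as large ones. The function name `subgroup_mixup` and all of its parameters are hypothetical.

```python
import numpy as np


def subgroup_mixup(X, y, g, n_new, alpha=1.0, seed=None):
    """Sketch of pairwise mixup across subgroups (illustrative, not the paper's code).

    X: (n, d) float features, y: (n,) numeric labels, g: (n,) group attribute.
    Assumption: a subgroup is one (group, label) combination.
    Returns n_new mixed feature vectors and their soft labels.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Encode each (group, label) combination as a single integer subgroup id.
    _, g_codes = np.unique(np.asarray(g), return_inverse=True)
    y_vals, y_codes = np.unique(y, return_inverse=True)
    sub = g_codes.ravel() * len(y_vals) + y_codes.ravel()

    # Index lists of the subgroups that actually occur in the data.
    members = [np.flatnonzero(sub == s) for s in np.unique(sub)]
    if len(members) < 2:
        raise ValueError("need at least two non-empty subgroups to mix")

    X_new = np.empty((n_new, X.shape[1]))
    y_new = np.empty(n_new)
    for k in range(n_new):
        # Assumption: pick two distinct subgroups uniformly at random, which
        # oversamples small subgroups relative to their share of the data.
        a, b = rng.choice(len(members), size=2, replace=False)
        i = rng.choice(members[a])
        j = rng.choice(members[b])
        lam = rng.beta(alpha, alpha)  # standard mixup weight in [0, 1]
        X_new[k] = lam * X[i] + (1.0 - lam) * X[j]
        y_new[k] = lam * y[i] + (1.0 - lam) * y[j]
    return X_new, y_new


# Usage: augment a small, imbalanced toy dataset with 50 mixed samples.
toy_rng = np.random.default_rng(0)
X_toy = toy_rng.normal(size=(20, 3))
y_toy = np.array([0, 1] * 10)
g_toy = np.array([0] * 15 + [1] * 5)  # group 1 is under-represented
X_aug, y_aug = subgroup_mixup(X_toy, y_toy, g_toy, n_new=50, seed=1)
```

The augmented pairs `(X_aug, y_aug)` would then be appended to the training set. Because the mixed labels are soft values in [0, 1], the downstream classifier would need a loss that accepts soft targets, such as cross-entropy.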

Comments: 5 pages, 2 figures, 1 table

Categories: stat.ML, cs.LG

Keywords: data augmentation, subgroup mixup, improving fairness, real-world benchmark fair classification data, group fairness

Related articles:

arXiv:2006.04960 [stat.ML] (Published 2020-06-08)

A Notion of Individual Fairness for Clustering

Matthäus Kleindessner, Pranjal Awasthi, Jamie Morgenstern

arXiv:2309.07453 [stat.ML] (Published 2023-09-14)

SC-MAD: Mixtures of Higher-order Networks for Data Augmentation

Madeline Navarro, Santiago Segarra

arXiv:2205.09906 [stat.ML] (Published 2022-05-20)

Data Augmentation for Compositional Data: Advancing Predictive Models of the Microbiome