arXiv:2206.05050 [cs.LG]

Improved Approximation for Fair Correlation Clustering

Published 2022-06-09, Version 1

Correlation clustering is a ubiquitous paradigm in unsupervised machine learning where addressing unfairness is a major challenge. Motivated by this, we study Fair Correlation Clustering where the data points may belong to different protected groups and the goal is to ensure fair representation of all groups across clusters. Our paper significantly generalizes and improves on the quality guarantees of previous work of Ahmadi et al. and Ahmadian et al. as follows.

- We allow the user to specify an arbitrary upper bound on the representation of each group in a cluster.
- Our algorithm allows individuals to have multiple protected features and ensures fairness simultaneously across them all.
- We prove guarantees for clustering quality and fairness in this general setting. Furthermore, this improves on the results for the special cases studied in previous work.

Our experiments on real-world data demonstrate that our clustering quality compared to the optimal solution is much better than what our theoretical result suggests.
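To make the two quantities in the abstract concrete, here is a minimal Python sketch of how a clustering might be scored and checked. It is not the paper's algorithm. It assumes the standard "minimize disagreements" objective on a complete signed graph, a single protected feature per point, and a per-group upper bound expressed as a fraction of the cluster size. The names `disagreement_cost`, `is_fair`, and `alpha` are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def disagreement_cost(labels, positive_edges):
    """Count point pairs on which the clustering disagrees with the input.

    Assumption: every pair listed in positive_edges is "similar" and every
    other pair is "dissimilar" (complete signed graph). A pair disagrees if
    it is similar but split, or dissimilar but placed together.
    """
    similar = {frozenset(edge) for edge in positive_edges}
    cost = 0
    for u, v in combinations(range(len(labels)), 2):
        together = labels[u] == labels[v]
        if together != (frozenset((u, v)) in similar):
            cost += 1
    return cost


def is_fair(labels, groups, alpha):
    """Check that no group exceeds its upper bound in any cluster.

    Assumption: alpha[g] is the maximum fraction of any cluster that
    group g may occupy (one protected feature per point).
    """
    cluster_sizes = Counter(labels)
    group_counts = Counter(zip(labels, groups))
    return all(
        count <= alpha[g] * cluster_sizes[c]
        for (c, g), count in group_counts.items()
    )


# Toy instance: points 0 and 2 are similar, points 1 and 3 are similar.
groups = ["red", "blue", "red", "blue"]
positive_edges = [(0, 2), (1, 3)]
alpha = {"red": 0.5, "blue": 0.5}

unconstrained = [0, 1, 0, 1]  # follows the similarity edges exactly
balanced = [0, 0, 1, 1]       # one red and one blue point per cluster

for name, labels in [("unconstrained", unconstrained), ("balanced", balanced)]:
    print(name, disagreement_cost(labels, positive_edges),
          is_fair(labels, groups, alpha))
```

On this toy instance the clustering that follows the similarity edges has zero disagreements but puts each group alone in a cluster, while the balanced clustering satisfies the bound at the price of more disagreements. This is the trade-off between clustering quality and fairness that the abstract refers to.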

Categories: cs.LG, cs.DS

Keywords: approximation, real-world data demonstrate, arbitrary upper bound, ensure fair representation, clustering quality

Related articles:

arXiv:cs/0612095 [cs.LG] (Published 2006-12-19, updated 2008-09-15)

Approximation of the Two-Part MDL Code

Pieter Adriaans, Paul Vitanyi

arXiv:2301.02896 [cs.LG] (Published 2023-01-07)

k-Means SubClustering: A Differentially Private Algorithm with Improved Clustering Quality

Devvrat Joshi, Janvi Thakkar

arXiv:2009.02400 [cs.LG] (Published 2020-09-04)

The Area Under the ROC Curve as a Measure of Clustering Quality

Pablo Andretta Jaskowiak, Ivan Gesteira Costa, Ricardo José Gabrielli Barreto Campello