arXiv:2310.14826 [stat.ML]

Sharp error bounds for imbalanced classification: how many examples in the minority class?

Anass Aghbalou, François Portier, Anne Sabourin

Published 2023-10-23 (Version 1)

When dealing with imbalanced classification data, reweighting the loss function is a standard procedure that balances the true positive and true negative rates within the risk measure. Despite significant theoretical work in this area, existing results do not adequately address a central challenge of the imbalanced classification framework: one class is negligibly small relative to the full sample size, and the risk function must be rescaled by a probability tending to zero. To address this gap, we present two novel contributions in the setting where the rare-class probability approaches zero: (1) a non-asymptotic, fast-rate probability bound for constrained balanced empirical risk minimization, and (2) a consistent upper bound for balanced nearest-neighbor estimates. Our findings provide a clearer understanding of the benefits of class weighting in realistic settings, opening new avenues for further research in this field.
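To make the reweighting idea concrete, the sketch below computes a balanced 0-1 risk: the average of the per-class error rates, which is equivalent to weighting each example by the inverse of its class size. This is a minimal illustration of class weighting in general, not the paper's estimator or bounds; the function name and the toy data are our own.

```python
import numpy as np

def balanced_empirical_risk(y_true, y_pred):
    """Balanced 0-1 risk: the average of per-class error rates.

    Equivalent to weighting each example by 1 / (k * n_c), where k is
    the number of classes and n_c the size of the example's class, so
    a rare class contributes as much as the majority class.
    (Illustrative sketch, not the paper's estimator.)
    """
    per_class_errors = []
    for c in np.unique(y_true):
        mask = y_true == c
        per_class_errors.append(np.mean(y_pred[mask] != c))
    return float(np.mean(per_class_errors))

# Toy imbalanced sample: roughly 1% minority class.
rng = np.random.default_rng(0)
y = (rng.random(10_000) < 0.01).astype(int)
y_hat = np.zeros_like(y)  # trivial predictor that always outputs the majority class
print(balanced_empirical_risk(y, y_hat))  # ~0.5: heavily penalized for ignoring the minority
```

Note how the trivial majority-class predictor has an unweighted error near 1% but a balanced risk near 0.5; rescaling by the (vanishing) rare-class probability is exactly what the paper's bounds must control.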

Related articles:
arXiv:1707.03905 [stat.ML] (Published 2017-07-12)
Influence of Resampling on Accuracy of Imbalanced Classification
arXiv:2501.04903 [stat.ML] (Published 2025-01-09)
Towards understanding the bias in decision trees
arXiv:2409.05598 [stat.ML] (Published 2024-09-09)
When resampling/reweighting improves feature learning in imbalanced classification?: A toy-model study