arXiv Analytics

arXiv:2409.00908 [stat.ML]

EnsLoss: Stochastic Calibrated Loss Ensembles for Preventing Overfitting in Classification

Ben Dai

Published 2024-09-02 (Version 1)

Empirical risk minimization (ERM) with a computationally feasible surrogate loss is a widely accepted approach to classification. Notably, the convexity and calibration (CC) properties of a loss function ensure the consistency of ERM in maximizing accuracy, thereby offering a wide range of options for surrogate losses. In this article, we propose a novel ensemble method, namely \textsc{EnsLoss}, which extends the ensemble learning concept to combining loss functions within the ERM framework. A key feature of our method is that it preserves the ``legitimacy'' of the combined losses, i.e., ensures their CC properties. Specifically, we first transform the CC conditions on losses into conditions on their derivatives, thereby bypassing the need for explicit loss functions and directly generating calibrated loss-derivatives. Consequently, and inspired by Dropout, \textsc{EnsLoss} enables loss ensembles within a single training process via doubly stochastic gradient descent (i.e., random batch samples and random calibrated loss-derivatives). We theoretically establish the statistical consistency of our approach and provide insights into its benefits. The numerical effectiveness of \textsc{EnsLoss} compared with fixed-loss methods is demonstrated through experiments on a broad range of 14 OpenML tabular datasets and 46 image datasets with various deep learning architectures. The Python repository and source code are available on \textsc{GitHub} at \url{https://github.com/statmlben/rankseg}.
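To make the derivative-level view concrete: for a margin-based surrogate $\phi(z)$ applied as $\phi(y f(x))$ with labels $y \in \{-1, +1\}$, convexity amounts to the derivative $\phi'$ being nondecreasing, and, for convex losses, classification-calibration amounts to $\phi'(0) < 0$ (Bartlett, Jordan, and McAuliffe, 2006). Gradient-based training only ever touches $\phi$ through $\phi'$, which is why it suffices to sample calibrated derivatives directly. The sketch below, in PyTorch, illustrates one such doubly stochastic step; the sampler `sample_calibrated_derivative` and its logistic-like family are illustrative assumptions for this sketch, not the paper's actual sampling scheme.

```python
# Minimal sketch (not the authors' implementation) of one EnsLoss-style step.
# Assumptions: binary labels y in {-1, +1} and a model producing real-valued
# scores f(x). A "calibrated loss-derivative" g = phi' must satisfy the CC
# conditions in derivative form: g nondecreasing (convexity), g(0) < 0
# (calibration).
import torch

def sample_calibrated_derivative():
    """Sample a random calibrated loss-derivative g(z) = phi'(z).

    Illustrative family only: g(z) = -a * sigmoid(-a * z) with random a > 0,
    which is nondecreasing with g(0) = -a/2 < 0 (it is the derivative of the
    temperature-scaled logistic loss log(1 + exp(-a z))).
    """
    a = torch.empty(1).uniform_(0.5, 2.0)  # random temperature
    return lambda z: -a * torch.sigmoid(-a * z)

def ensloss_step(model, optimizer, x, y):
    """One doubly stochastic step: random batch + random loss-derivative."""
    g = sample_calibrated_derivative()   # fresh derivative for this batch
    scores = model(x).squeeze(-1)        # f(x), shape (batch,)
    z = y * scores                       # margins y * f(x)
    # d phi(y f(x)) / d scores = y * g(y f(x)); a linear surrogate with
    # detached weights reproduces exactly that gradient under autograd.
    surrogate = (scores * (y * g(z)).detach()).mean()
    optimizer.zero_grad()
    surrogate.backward()
    optimizer.step()
```

Because `g` is resampled at every batch, the effective objective changes stochastically across iterations, which is the Dropout-like source of regularization the abstract describes.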

Related articles:
arXiv:1803.00276 [stat.ML] (Published 2018-03-01)
Model-Based Clustering and Classification of Functional Data
arXiv:1808.03064 [stat.ML] (Published 2018-08-09)
Gradient and Newton Boosting for Classification and Regression
arXiv:1805.00811 [stat.ML] (Published 2018-05-02)
An Evaluation of Classification and Outlier Detection Algorithms