arXiv Analytics

arXiv:1905.13742 [stat.ML]

High Dimensional Classification via Empirical Risk Minimization: Improvements and Optimality

Xiaoyi Mai, Zhenyu Liao

Published 2019-05-31 (Version 1)

In this article, we investigate a family of classification algorithms defined by the principle of empirical risk minimization, in the high dimensional regime where the feature dimension $p$ and the sample size $n$ are both large and comparable. Building on recent advances in high dimensional statistics and random matrix theory, we provide, under a mixture data model, a unified stochastic characterization of classifiers learned with different loss functions. Our results are instrumental to an in-depth understanding of, as well as practical improvements to, this fundamental classification approach. As the main outcome, we demonstrate the existence of a universally optimal loss function which yields the best high dimensional performance at any given $n/p$ ratio.
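To make the setting concrete, the following is a minimal sketch (not the paper's method) of empirical risk minimization for linear classification on synthetic Gaussian mixture data, with two illustrative loss functions; the mean vector, learning rate, and iteration count are all hypothetical choices for the example.

```python
import numpy as np

def erm_classifier(X, y, loss="logistic", lr=0.1, n_iter=500):
    """Minimize the empirical risk (1/n) * sum_i L(y_i * w^T x_i)
    by gradient descent. Labels y are in {-1, +1}.
    Two illustrative losses: logistic log(1 + e^{-m}) and squared (m - 1)^2."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        m = y * (X @ w)  # classification margins y_i * w^T x_i
        if loss == "logistic":
            g = -y / (1.0 + np.exp(m))   # dL/dm * dm/dw factor in y
        else:
            g = 2.0 * (m - 1.0) * y      # squared loss gradient
        w -= lr * (X.T @ g) / n          # average gradient step
    return w

# Synthetic two-class Gaussian mixture: x_i = y_i * mu + noise,
# with n and p large and comparable, as in the high dimensional regime.
rng = np.random.default_rng(0)
n, p = 400, 100
mu = np.ones(p) / np.sqrt(p)             # hypothetical class-mean direction
y = rng.choice([-1, 1], size=n)
X = y[:, None] * mu + rng.standard_normal((n, p))

w = erm_classifier(X, y, loss="logistic")
acc = np.mean(np.sign(X @ w) == y)       # training accuracy of the ERM classifier
```

Swapping the `loss` argument changes the learned direction `w`; the paper's question is which loss choice is optimal at a given $n/p$ ratio, which this toy example does not answer.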

Related articles:
arXiv:1609.01872 [stat.ML] (Published 2016-09-07)
Chaining Bounds for Empirical Risk Minimization
arXiv:1802.08626 [stat.ML] (Published 2018-02-23)
Empirical Risk Minimization under Fairness Constraints
arXiv:1806.10701 [stat.ML] (Published 2018-06-27)
Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data