arXiv Analytics

arXiv:1905.13742 [stat.ML]

High Dimensional Classification via Empirical Risk Minimization: Improvements and Optimality

Xiaoyi Mai, Zhenyu Liao

Published 2019-05-31 (Version 1)

In this article, we investigate a family of classification algorithms defined by the principle of empirical risk minimization, in the high dimensional regime where the feature dimension $p$ and the sample size $n$ are both large and comparable. Building on recent advances in high dimensional statistics and random matrix theory, we provide, under a mixture data model, a unified stochastic characterization of classifiers learned with different loss functions. Our results are instrumental to an in-depth understanding of, as well as practical improvements to, this fundamental classification approach. As the main outcome, we demonstrate the existence of a universally optimal loss function which yields the best high dimensional performance at any given $n/p$ ratio.
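To make the setting concrete, the following is a minimal sketch (not the paper's method) of empirical risk minimization for linear classification on synthetic Gaussian mixture data, with two illustrative loss functions; the mean vector, learning rate, and iteration count are all hypothetical choices for the example.

```python
import numpy as np

def erm_classifier(X, y, loss="logistic", lr=0.1, n_iter=500):
    """Minimize the empirical risk (1/n) * sum_i L(y_i * w^T x_i)
    by gradient descent. Labels y are in {-1, +1}.
    Two illustrative losses: logistic log(1 + e^{-m}) and squared (m - 1)^2."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        m = y * (X @ w)  # classification margins y_i * w^T x_i
        if loss == "logistic":
            g = -y / (1.0 + np.exp(m))   # dL/dm * dm/dw factor in y
        else:
            g = 2.0 * (m - 1.0) * y      # squared loss gradient
        w -= lr * (X.T @ g) / n          # average gradient step
    return w

# Synthetic two-class Gaussian mixture: x_i = y_i * mu + noise,
# with n and p large and comparable, as in the high dimensional regime.
rng = np.random.default_rng(0)
n, p = 400, 100
mu = np.ones(p) / np.sqrt(p)             # hypothetical class-mean direction
y = rng.choice([-1, 1], size=n)
X = y[:, None] * mu + rng.standard_normal((n, p))

w = erm_classifier(X, y, loss="logistic")
acc = np.mean(np.sign(X @ w) == y)       # training accuracy of the ERM classifier
```

Swapping the `loss` argument changes the learned direction `w`; the paper's question is which loss choice is optimal at a given $n/p$ ratio, which this toy example does not answer.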

Related articles:
arXiv:1609.01872 [stat.ML] (Published 2016-09-07)
Chaining Bounds for Empirical Risk Minimization
arXiv:1802.08626 [stat.ML] (Published 2018-02-23)
Empirical Risk Minimization under Fairness Constraints
arXiv:1806.10701 [stat.ML] (Published 2018-06-27)
Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data