arXiv:1607.06534 [stat.ML]

The Landscape of Empirical Risk for Non-convex Losses

Song Mei, Yu Bai, Andrea Montanari

Published 2016-07-22 (Version 1)

We revisit the problem of learning a noisy linear classifier by minimizing the empirical risk associated with the square loss. While the empirical risk is non-convex, we prove that its structure is remarkably simple. Namely, when the sample size is larger than $C \, d\log d$ (with $d$ the dimension, and $C$ a constant) the following hold with high probability: $(a)$ The empirical risk has a unique local minimum (which is also the global minimum); $(b)$ Gradient descent converges exponentially fast to the global minimizer, from any initialization; $(c)$ The global minimizer approaches the true parameter at a nearly optimal rate. The core of our argument is to establish a uniform convergence result for the gradients and Hessians of the empirical risk.
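
The setting in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes one common instantiation of the noisy linear classifier (labels in {0,1} generated through a logistic link), and the constant in the sample size, the step size, and the iteration count are arbitrary illustrative choices.

```python
import numpy as np

def sigma(t):
    """Logistic link (an assumed model choice, not specified in the abstract)."""
    return 1.0 / (1.0 + np.exp(-t))

def risk_gradient(w, X, y):
    """Gradient of the empirical risk R_n(w) = mean((y - sigma(X w))^2)."""
    p = sigma(X @ w)
    # d/dw (y_i - p_i)^2 = -2 (y_i - p_i) p_i (1 - p_i) x_i
    return -2.0 * (X.T @ ((y - p) * p * (1 - p))) / len(y)

rng = np.random.default_rng(0)
d = 20                               # dimension
n = int(10 * d * np.log(d))          # n > C d log d, with C = 10 chosen arbitrarily
w_true = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = (rng.random(n) < sigma(X @ w_true)).astype(float)   # noisy labels

# Plain gradient descent from a random initialization; the paper's result says
# that in this sample-size regime the empirical risk has a unique local
# (= global) minimum and gradient descent converges to it exponentially fast.
w = rng.normal(size=d)
step = 0.5                           # illustrative step size
for _ in range(3000):
    w -= step * risk_gradient(w, X, y)

print("estimation error:", np.linalg.norm(w - w_true))
```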

Related articles:
arXiv:1912.02143 [stat.ML] (Published 2019-12-04)
Landscape Complexity for the Empirical Risk of Generalized Linear Models
arXiv:2502.01953 [stat.ML] (Published 2025-02-04)
Local minima of the empirical risk in high dimension: General theorems and convex examples
arXiv:1705.07038 [stat.ML] (Published 2017-05-19)
The Landscape of Deep Learning Algorithms