arXiv:2001.06892 [stat.ML]

Optimal Rate of Convergence for Deep Neural Network Classifiers under the Teacher-Student Setting

Tianyang Hu, Zuofeng Shang, Guang Cheng

Published 2020-01-19 (Version 1)

Classifiers built with neural networks handle large-scale, high-dimensional data, such as facial images in computer vision, extremely well, while traditional statistical methods often fail miserably. In this paper, we attempt to understand this empirical success in high-dimensional classification by deriving convergence rates of the excess risk. In particular, a teacher-student framework is proposed in which the Bayes classifier is assumed to be expressible as a ReLU neural network. In this setup, we obtain a dimension-independent and unimprovable rate of convergence, i.e., $O(n^{-2/3})$, for classifiers trained with either the 0-1 loss or the hinge loss. This rate improves further to $O(n^{-1})$ when the data are separable. Here, $n$ denotes the sample size.
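
The teacher-student setup in the abstract can be made concrete with a minimal sketch (my own illustration, not the authors' code). A fixed "teacher" ReLU network defines the Bayes classifier $f^*(x) = \mathrm{sign}(g(x))$, a "student" ReLU network is fit with the hinge loss, and the excess 0-1 risk $R(\hat f) - R(f^*)$ is estimated by Monte Carlo. Because the labels below are taken directly from the teacher's sign (no label noise), this corresponds to the separable case; the network widths, sample size, and optimizer are arbitrary choices for illustration.

```python
# Hypothetical sketch of the teacher-student setting: not the paper's code.
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 8  # input dimension (arbitrary choice)

def mlp(widths):
    """Fully connected ReLU network with a real-valued scalar output."""
    layers = []
    for a, b in zip(widths[:-1], widths[1:]):
        layers += [nn.Linear(a, b), nn.ReLU()]
    return nn.Sequential(*layers[:-1])  # drop the final ReLU

# Teacher network: its sign plays the role of the Bayes classifier.
teacher = mlp([d, 16, 16, 1])
for p in teacher.parameters():
    p.requires_grad_(False)

def sample(n):
    """Draw features and noiseless teacher labels (separable case)."""
    x = torch.randn(n, d)
    y = torch.sign(teacher(x)).squeeze(1)
    y[y == 0] = 1.0  # break ties deterministically
    return x, y

# Student network trained with the hinge loss on n samples.
n = 2000
x_train, y_train = sample(n)
student = mlp([d, 32, 32, 1])
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    margin = y_train * student(x_train).squeeze(1)
    loss = torch.clamp(1.0 - margin, min=0.0).mean()  # hinge loss
    loss.backward()
    opt.step()

# Monte Carlo estimate of the excess 0-1 risk. In this noiseless setup the
# teacher's sign is the Bayes classifier and the Bayes risk is zero, so the
# excess risk is just the student's disagreement rate with the teacher.
with torch.no_grad():
    x_test, y_test = sample(100_000)
    pred = torch.sign(student(x_test)).squeeze(1)
    excess_risk = (pred != y_test).float().mean().item()
print(f"estimated excess 0-1 risk with n={n}: {excess_risk:.4f}")
```

Repeating this for a grid of sample sizes $n$ and plotting the estimated excess risk on a log-log scale is one way to eyeball the rates discussed in the abstract; adding label noise (flipping each teacher label with some probability) would move the experiment from the separable to the general setting.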

Related articles:
arXiv:2409.18804 [stat.ML] (Published 2024-09-27)
Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions
arXiv:2209.02305 [stat.ML] (Published 2022-09-06)
Rates of Convergence for Regression with the Graph Poly-Laplacian
arXiv:2306.01122 [stat.ML] (Published 2023-06-01)
On the Convergence of Coordinate Ascent Variational Inference