arXiv Analytics

arXiv:2309.13337 [cs.LG]

On the Asymptotic Learning Curves of Kernel Ridge Regression under Power-law Decay

Yicheng Li, Haobo Zhang, Qian Lin

Published 2023-09-23 (Version 1)

The 'benign overfitting phenomenon' widely observed in the neural network literature challenges the 'bias-variance trade-off' doctrine of statistical learning theory. Since the generalization ability of 'lazily trained' over-parametrized neural networks can be well approximated by that of neural tangent kernel regression, the curve of the excess risk (namely, the learning curve) of kernel ridge regression has recently attracted increasing attention. However, most recent arguments about the learning curve are heuristic and rest on the 'Gaussian design' assumption. In this paper, under mild and more realistic assumptions, we rigorously provide a full characterization of the learning curve, elaborating the effects and the interplay of the choice of the regularization parameter, the source condition, and the noise. In particular, our results suggest that the 'benign overfitting phenomenon' exists in very wide neural networks only when the noise level is small.
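
As a hedged illustration of the quantities named in the abstract (this sketch is not from the paper), the Python snippet below fits kernel ridge regression with a Laplacian kernel, whose eigenvalues decay polynomially in the spirit of the power-law decay in the title, and empirically traces a learning curve: the test error as the sample size n grows, at a fixed regularization parameter and noise level. The target function, noise level, and regularization value are all illustrative assumptions.

```python
# Illustrative sketch only: an empirical learning curve for kernel ridge
# regression (KRR). The kernel, target, noise level, and lambda below are
# assumptions chosen for demonstration, not choices made in the paper.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)

def target(x):
    # A smooth target; how well it aligns with the kernel's eigenfunctions
    # plays the role of the "source condition" in the abstract.
    return np.sin(2 * np.pi * x).ravel()

noise_level = 0.1            # the noise level discussed in the abstract
lam = 1e-3                   # ridge regularization parameter (lambda)
x_test = np.linspace(0, 1, 1000)[:, None]
y_test = target(x_test)

for n in [50, 100, 200, 400, 800]:
    x_train = rng.uniform(0, 1, size=(n, 1))
    y_train = target(x_train) + noise_level * rng.standard_normal(n)
    # Laplacian kernel: its eigenvalue sequence decays polynomially,
    # a standard example of power-law spectral decay.
    model = KernelRidge(alpha=lam, kernel="laplacian", gamma=1.0)
    model.fit(x_train, y_train)
    excess_risk = np.mean((model.predict(x_test) - y_test) ** 2)
    print(f"n={n:4d}  empirical excess risk ~ {excess_risk:.4f}")
```

Plotting the printed risks against n on a log-log scale gives an empirical learning curve; the paper's contribution is a rigorous characterization of its asymptotic rate as a function of the regularization parameter, the source condition, and the noise.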

Related articles:
arXiv:2312.05885 [cs.LG] (Published 2023-12-10): Adaptive Parameter Selection for Kernel Ridge Regression
arXiv:2410.17796 [cs.LG] (Published 2024-10-23): A Comprehensive Analysis on the Learning Curve in Kernel Ridge Regression
arXiv:1406.2622 [cs.LG] (Published 2014-06-10): Equivalence of Learning Algorithms