arXiv:1909.12673 [cs.LG]

A Constructive Prediction of the Generalization Error Across Scales

Jonathan S. Rosenfeld, Amir Rosenfeld, Yonatan Belinkov, Nir Shavit

Published 2019-09-27, Version 1

The dependency of the generalization error of neural networks on model and dataset size is of critical importance both in practice and for understanding the theory of neural networks. Nevertheless, the functional form of this dependency remains elusive. In this work, we present a functional form which approximates well the generalization error in practice. Capitalizing on the successful concept of model scaling (e.g., width, depth), we are able to simultaneously construct such a form and specify the exact models which can attain it across model/data scales. Our construction follows insights obtained from observations conducted over a range of model/data scales, in various model types and datasets, in vision and language tasks. We show that the form both fits the observations well across scales, and provides accurate predictions from small- to large-scale models and data.

Categories: cs.LG, cs.CL, cs.CV, stat.ML

Keywords: generalization error, constructive prediction, neural networks, functional form, model/data scales

Related articles:

arXiv:1206.3274 [cs.LG] (Published 2012-06-13)

Small Sample Inference for Generalization Error in Classification Using the CUD Bound

Eric B. Laber, Susan A. Murphy

arXiv:1301.0579 [cs.LG] (Published 2012-12-12)

Almost-everywhere algorithmic stability and generalization error

Samuel Kutin, Partha Niyogi

arXiv:1711.05482 [cs.LG] (Published 2017-11-15)

Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles

Dhruv Mahajan, Vivek Gupta, S Sathiya Keerthi, Sellamanickam Sundararajan, Shravan Narayanamurthy, Rahul Kidambi