arXiv Analytics


arXiv:1710.05468 [stat.ML]

Generalization in Deep Learning

Kenji Kawaguchi, Leslie Pack Kaelbling, Yoshua Bengio

Published 2017-10-16 (Version 1)

This paper explains why deep learning can generalize well, despite large capacity and possible algorithmic instability, nonrobustness, and sharp minima, effectively addressing an open problem in the literature. Based on our theoretical insight, this paper also proposes a family of new regularization methods. Its simplest member was empirically shown to improve base models and achieve state-of-the-art performance on MNIST and CIFAR-10 benchmarks. Moreover, this paper presents both data-dependent and data-independent generalization guarantees with improved convergence rates. Our results suggest several new open areas of research.
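The abstract does not spell out the proposed regularization family, so the following is only a hedged sketch of what a simple, data-dependent regularizer of this general flavor could look like in PyTorch: a standard cross-entropy loss plus a penalty on the largest per-class mean absolute logit over the training batch. The penalty form, the coefficient lam, and the function name regularized_loss are illustrative assumptions, not the paper's stated method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def regularized_loss(logits, targets, lam=1e-3):
    """Cross-entropy plus a simple data-dependent output penalty.

    The penalty is the largest (over classes) mean absolute logit on the
    batch, scaled by `lam`. This is an illustrative stand-in for the kind
    of lightweight regularizer the abstract alludes to, not necessarily
    the paper's exact formulation.
    """
    ce = F.cross_entropy(logits, targets)
    # Mean absolute logit per class, then the maximum over classes.
    penalty = logits.abs().mean(dim=0).max()
    return ce + lam * penalty

# Minimal usage on dummy MNIST-sized data with a hypothetical linear model.
if __name__ == "__main__":
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x = torch.randn(64, 1, 28, 28)
    y = torch.randint(0, 10, (64,))
    loss = regularized_loss(model(x), y, lam=1e-3)
    loss.backward()
    print(float(loss))
```

Because the penalty depends on the network's outputs on the training data, it is data-dependent in the sense the abstract emphasizes; the actual regularizers and their guarantees are developed in the paper itself.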

Related articles:
arXiv:2012.04115 [stat.ML] (Published 2020-12-07)
Generalization bounds for deep learning
arXiv:1706.02052 [stat.ML] (Published 2017-06-07)
Are Saddles Good Enough for Deep Learning?
arXiv:1705.08665 [stat.ML] (Published 2017-05-24)
Bayesian Compression for Deep Learning