arXiv Analytics


arXiv:1710.05468 [stat.ML]

Generalization in Deep Learning

Kenji Kawaguchi, Leslie Pack Kaelbling, Yoshua Bengio

Published 2017-10-16 (Version 1)

This paper explains why deep learning can generalize well, despite large capacity and possible algorithmic instability, nonrobustness, and sharp minima, effectively addressing an open problem in the literature. Based on our theoretical insight, this paper also proposes a family of new regularization methods. Its simplest member was empirically shown to improve base models and achieve state-of-the-art performance on MNIST and CIFAR-10 benchmarks. Moreover, this paper presents both data-dependent and data-independent generalization guarantees with improved convergence rates. Our results suggest several new open areas of research.
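The abstract does not spell out the proposed regularization family, so the following is only a hedged sketch of what a simple, data-dependent regularizer of this general flavor could look like in PyTorch: a standard cross-entropy loss plus a penalty on the largest per-class mean absolute logit over the training batch. The penalty form, the coefficient lam, and the function name regularized_loss are illustrative assumptions, not the paper's stated method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def regularized_loss(logits, targets, lam=1e-3):
    """Cross-entropy plus a simple data-dependent output penalty.

    The penalty is the largest (over classes) mean absolute logit on the
    batch, scaled by `lam`. This is an illustrative stand-in for the kind
    of lightweight regularizer the abstract alludes to, not necessarily
    the paper's exact formulation.
    """
    ce = F.cross_entropy(logits, targets)
    # Mean absolute logit per class, then the maximum over classes.
    penalty = logits.abs().mean(dim=0).max()
    return ce + lam * penalty

# Minimal usage on dummy MNIST-sized data with a hypothetical linear model.
if __name__ == "__main__":
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x = torch.randn(64, 1, 28, 28)
    y = torch.randint(0, 10, (64,))
    loss = regularized_loss(model(x), y, lam=1e-3)
    loss.backward()
    print(float(loss))
```

Because the penalty depends on the network's outputs on the training data, it is data-dependent in the sense the abstract emphasizes; the actual regularizers and their guarantees are developed in the paper itself.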

Related articles:
arXiv:2012.04115 [stat.ML] (Published 2020-12-07)
Generalization bounds for deep learning
arXiv:1706.02052 [stat.ML] (Published 2017-06-07)
Are Saddles Good Enough for Deep Learning?
arXiv:1705.08665 [stat.ML] (Published 2017-05-24)
Bayesian Compression for Deep Learning