arXiv:1611.07476 [cs.LG]

Singularity of the Hessian in Deep Learning

Levent Sagun, Leon Bottou, Yann LeCun

Published 2016-11-22, Version 1

We look at the eigenvalues of the Hessian of a loss function before and after training. The eigenvalue distribution is seen to be composed of two parts: the bulk, which is concentrated around zero, and the edges, which are scattered away from zero. We present empirical evidence that the bulk indicates how over-parametrized the system is, and that the edges indicate the complexity of the input data.
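
The bulk-and-edges picture can be illustrated on a toy problem. The sketch below is not the authors' experiment: it assumes a linear model with a mean-squared-error loss, for which the Hessian is X^T X / n in closed form, and it uses random Gaussian inputs. The function names, the problem sizes, and the tolerance used to decide what counts as "around zero" are all hypothetical choices made for illustration.

```python
import numpy as np


def hessian_spectrum(n_samples=10, n_params=50, seed=0):
    """Eigenvalues of the Hessian of an MSE loss for a linear model.

    For L(w) = (1 / (2 n)) * ||X w - y||^2 the Hessian is X^T X / n,
    which does not depend on w or y. (Toy assumption, not the paper's setup.)
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_params))  # synthetic inputs
    hessian = X.T @ X / n_samples               # shape (n_params, n_params)
    # eigvalsh is intended for symmetric matrices; it returns real
    # eigenvalues in ascending order.
    return np.linalg.eigvalsh(hessian)


def split_bulk_and_edges(eigenvalues, tol=1e-8):
    """Split a spectrum into near-zero values (bulk) and the rest (edges).

    The threshold `tol` is an illustrative assumption.
    """
    near_zero = np.abs(eigenvalues) < tol
    return eigenvalues[near_zero], eigenvalues[~near_zero]


eigs = hessian_spectrum()
bulk, edges = split_bulk_and_edges(eigs)
print("bulk size:", len(bulk), "edge size:", len(edges))
```

In this toy setting the Hessian has rank at most n_samples, so with 10 samples and 50 parameters at least 40 eigenvalues are zero up to rounding error. The size of the near-zero bulk therefore grows with the number of surplus parameters, while the nonzero eigenvalues are determined by the input data X.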

Comments: ICLR 2017 Submission on Nov 4, 2016
Categories: cs.LG
Related articles:
arXiv:1706.10239 [cs.LG] (Published 2017-06-30)
Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes
arXiv:1702.07800 [cs.LG] (Published 2017-02-24)
On the Origin of Deep Learning
arXiv:1506.00619 [cs.LG] (Published 2015-06-01)
Blocks and Fuel: Frameworks for deep learning