arXiv:1611.07476 [cs.LG]

Singularity of the Hessian in Deep Learning

Published 2016-11-22, Version 1

We look at the eigenvalues of the Hessian of a loss function before and after training. The eigenvalue distribution is seen to be composed of two parts: the bulk, which is concentrated around zero, and the edges, which are scattered away from zero. We present empirical evidence that the bulk indicates how over-parametrized the system is, and that the edges indicate the complexity of the input data.
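The bulk/edge split described in the abstract can be illustrated with a toy sketch. This is not the paper's experiment: it assumes a linear least-squares model rather than a deep network, and the function name, sample sizes, and tolerance are hypothetical choices made for illustration. For the loss L(w) = (1/2n)||Xw - y||^2 the Hessian is X^T X / n, so with more parameters than samples its rank is at most n and at least (parameters - n) eigenvalues sit at zero.

```python
import numpy as np


def hessian_spectrum(n_samples=5, n_params=20, seed=0):
    """Eigenvalues of the Hessian of L(w) = (1/2n) * ||X w - y||^2.

    For this quadratic loss the Hessian is X^T X / n and does not depend
    on w or y, so it can be formed directly from the inputs.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical random inputs; real data would replace this.
    X = rng.normal(size=(n_samples, n_params))
    H = X.T @ X / n_samples
    # eigvalsh is for symmetric matrices and returns ascending eigenvalues.
    return np.linalg.eigvalsh(H)


eigs = hessian_spectrum()

# Split the spectrum with a tolerance relative to the largest eigenvalue
# (an assumed threshold, not one taken from the paper).
tol = 1e-8 * eigs.max()
bulk = eigs[np.abs(eigs) < tol]    # numerically zero directions
edges = eigs[np.abs(eigs) >= tol]  # directions set by the input data

print("bulk size:", len(bulk))
print("edge size:", len(edges))
```

In this toy setting the size of the bulk grows with the number of surplus parameters, while the edge eigenvalues are determined entirely by the inputs X. Whether the same picture holds for a trained deep network is the empirical question the paper addresses.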

Comments: ICLR 2017 Submission on Nov 4, 2016

Categories: cs.LG

Keywords: deep learning, singularity, input data, loss function, eigenvalue distribution

Related articles:

arXiv:1706.10239 [cs.LG] (Published 2017-06-30)

Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes

Lei Wu, Zhanxing Zhu, Weinan E

arXiv:1702.07800 [cs.LG] (Published 2017-02-24)

On the Origin of Deep Learning

Haohan Wang, Bhiksha Raj, Eric P. Xing

arXiv:1506.00619 [cs.LG] (Published 2015-06-01)

Blocks and Fuel: Frameworks for deep learning