arXiv:2309.09240 [cond-mat.dis-nn]

High-dimensional manifold of solutions in neural networks: insights from statistical physics

Enrico M. Malatesta

Published 2023-09-17, Version 1

In these pedagogical notes I review the statistical mechanics approach to neural networks, focusing on the paradigmatic example of the perceptron architecture with binary and continuous weights in the classification setting. I first review Gardner's approach based on the replica method and the derivation of the SAT/UNSAT transition in the storage setting. I then discuss recent works that have unveiled how the zero-training-error configurations are geometrically arranged, and how this arrangement changes as the size of the training set increases. I also illustrate how different regions of the solution space can be explored analytically and how the landscape in the vicinity of a solution can be characterized. I give evidence that, in binary-weight models, algorithmic hardness is a consequence of the disappearance of a clustered region of solutions that extends to very large distances. Finally, I demonstrate how the study of linear mode connectivity between solutions can give insights into the average shape of the solution manifold.
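For orientation, the central object in Gardner's approach is the volume of weight space that correctly classifies all P = αN random patterns. The display below is a standard sketch of this setup from the replica literature, included here as a pointer rather than quoted from the notes:

\[
Z \;=\; \int d\mu(\mathbf{w}) \,\prod_{\mu=1}^{P} \Theta\!\left(\frac{y^{\mu}\,\mathbf{w}\cdot\mathbf{x}^{\mu}}{\sqrt{N}}\right),
\qquad
\frac{1}{N}\,\overline{\ln Z} \;=\; \lim_{n\to 0}\,\frac{\overline{Z^{n}}-1}{n\,N},
\]

where \(\Theta\) is the Heaviside step function, \(d\mu(\mathbf{w})\) is the uniform measure over the sphere (continuous weights) or over \(\{-1,+1\}^N\) (binary weights), and the overline denotes the average over the random patterns \((\mathbf{x}^\mu, y^\mu)\). The SAT/UNSAT transition is the storage capacity \(\alpha_c\) at which the typical volume vanishes: \(\alpha_c = 2\) for the spherical perceptron (Gardner) and \(\alpha_c \approx 0.833\) for binary weights (Krauth and Mézard).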
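As a concrete illustration of linear mode connectivity, the minimal self-contained sketch below (illustrative code, not taken from the paper; the helpers perceptron and train_error are hypothetical names) finds two zero-error solutions of a spherical perceptron with the classic perceptron algorithm and measures the training error along the straight segment between them. It assumes Gaussian random patterns at load α = P/N = 0.5, well below capacity, so zero-error solutions exist with high probability:

import numpy as np

rng = np.random.default_rng(0)
N, P = 200, 100                       # dimension and number of patterns (alpha = 0.5)
X = rng.standard_normal((P, N))       # random Gaussian input patterns
y = rng.choice([-1.0, 1.0], size=P)   # random binary labels

def perceptron(seed):
    # Classic perceptron algorithm from a random start; converges because
    # the data is linearly separable w.h.p. at alpha well below capacity.
    w = np.random.default_rng(seed).standard_normal(N)
    while True:
        margins = y * (X @ w)
        wrong = np.flatnonzero(margins <= 0)
        if wrong.size == 0:
            return w / np.linalg.norm(w)   # normalize onto the sphere
        w += y[wrong[0]] * X[wrong[0]]     # standard perceptron update

def train_error(w):
    # Fraction of misclassified training patterns.
    return float(np.mean(y * (X @ w) <= 0))

w1, w2 = perceptron(1), perceptron(2)      # two independent zero-error solutions
for gamma in np.linspace(0.0, 1.0, 11):
    w = (1 - gamma) * w1 + gamma * w2      # straight-line interpolation in weight space
    print(f"gamma = {gamma:.1f}   train error = {train_error(w):.3f}")

The presence or absence of an error barrier along this segment, and how it evolves with α, is the kind of signal the notes use to probe the average shape of the solution manifold.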

Comments: 22 pages, 9 figures, based on a set of lectures given at the "School of the Italian Society of Statistical Physics", IMT, Lucca
Related articles:
arXiv:2302.14112 [cond-mat.dis-nn] (Published 2023-02-27)
Injectivity of ReLU networks: perspectives from statistical physics
arXiv:2107.01163 [cond-mat.dis-nn] (Published 2021-07-02)
Unveiling the structure of wide flat minima in neural networks
arXiv:2306.01477 [cond-mat.dis-nn] (Published 2023-06-02)
Statistical physics of learning in high-dimensional chaotic systems