arXiv Analytics

arXiv:2407.19353 [cond-mat.dis-nn]

A spring-block theory of feature learning in deep neural networks

Cheng Shi, Liming Pan, Ivan Dokmanić

Published 2024-07-28 (Version 1)

A central question in deep learning is how deep neural networks (DNNs) learn features. DNN layers progressively collapse data into a regular low-dimensional geometry. This collective effect of non-linearity, noise, learning rate, width, depth, and numerous other parameters has eluded first-principles theories built from microscopic neuronal dynamics. Here we present a noise-non-linearity phase diagram that highlights where shallow or deep layers learn features more effectively. We then propose a macroscopic mechanical theory of feature learning that accurately reproduces this phase diagram, offering a clear intuition for why and how some DNNs are "lazy" and some are "active", and relating the distribution of feature learning over layers to test accuracy.
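
For intuition only, the sketch below simulates a textbook spring-block chain of the kind the title evokes: blocks joined by springs, held back by static friction, and jiggled by noise. The mapping used here (block spacings as per-layer feature learning, static friction standing in for nonlinearity, random shaking for SGD noise) and all parameter names are illustrative assumptions, not the paper's actual model or equations; the point is simply to show how friction versus noise decides whether displacement stays concentrated near the pulled end ("lazy") or spreads evenly along the chain ("active").

# Minimal spring-block chain sketch (illustrative assumption, not the paper's equations):
# blocks = layer boundaries, friction = nonlinearity, shaking = SGD noise.
import numpy as np

rng = np.random.default_rng(0)

L = 10          # number of gaps ("layers")
k = 1.0         # spring stiffness
mu = 0.5        # static friction threshold (stands in for nonlinearity)
sigma = 0.1     # noise amplitude (stands in for SGD noise)
pull = 1.0      # displacement imposed at the free end ("task demand")
steps = 5000
dt = 0.01

x = np.zeros(L + 1)   # block positions; x[0] is a fixed wall
x[-1] = pull          # last block is held at the target displacement

for _ in range(steps):
    # net spring force on each interior block from its two neighbours
    f = k * (x[2:] - x[1:-1]) - k * (x[1:-1] - x[:-2])
    f += sigma * rng.standard_normal(L - 1)          # random shaking
    # a block slides only when the net force exceeds static friction
    slide = np.abs(f) > mu
    x[1:-1] += dt * np.where(slide, f - mu * np.sign(f), 0.0)

# gap sizes ~ how much "feature learning" each layer contributes;
# near-uniform gaps ~ learning spread evenly over layers
print(np.diff(x))

With mu large and sigma small, the interior blocks never slip and almost all of the displacement remains in the last gap; with more shaking (or weaker friction), the chain relaxes toward equal spacing, i.e. learning distributed over all layers.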

Related articles:
arXiv:2501.19281 [cond-mat.dis-nn] (Published 2025-01-31)
Statistical Physics of Deep Neural Networks: Generalization Capability, Beyond the Infinite Width, and Feature Learning
arXiv:2306.12548 [cond-mat.dis-nn] (Published 2023-06-21)
Finite-time Lyapunov exponents of deep neural networks
arXiv:1808.00408 [cond-mat.dis-nn] (Published 2018-08-01)
Geometry of energy landscapes and the optimizability of deep neural networks