arXiv:2009.08372 [stat.ML]

A Principle of Least Action for the Training of Neural Networks

Skander Karkar, Ibrahim Ayed, Emmanuel de Bézenac, Patrick Gallinari

Published 2020-09-17 (Version 1)

Neural networks achieve high generalization performance on many tasks despite being heavily over-parameterized. Since classical statistical learning theory struggles to explain this behavior, much recent effort has focused on uncovering the mechanisms behind it, in the hope of developing a more adequate theoretical framework and gaining better control over trained models. In this work, we adopt an alternative perspective, viewing the neural network as a dynamical system that displaces input particles over time. Through a series of experiments analyzing the network's behavior via its displacements, we show the presence of a low-kinetic-energy bias in the network's transport map, and link this bias to generalization performance. From this observation, we reformulate the learning problem as finding neural networks that solve the task while transporting the data as efficiently as possible. This formulation allows us to provide regularity results for the solution network, grounded in Optimal Transport theory. From a practical viewpoint, it leads to a new learning algorithm that automatically adapts to the complexity of the given task and yields networks with high generalization ability even in low-data regimes.
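
To make the least-action idea concrete, here is a minimal sketch, assuming a PyTorch-style residual network: each residual block displaces its input by v = f_k(x), and a kinetic-energy penalty on these displacements is added to the task loss. The class name KineticResNet, the penalty weight lam, and all sizes are illustrative assumptions for this sketch, not the authors' actual algorithm.

```python
import torch
import torch.nn as nn

class KineticResNet(nn.Module):
    """Residual net that also reports the kinetic energy of its transport map.
    Hypothetical illustration of the least-action principle, not the paper's method."""

    def __init__(self, dim: int, depth: int, n_classes: int):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(depth)
        ])
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):
        kinetic = x.new_zeros(())          # running sum of squared displacements
        for block in self.blocks:
            v = block(x)                   # displacement ("velocity") at this step
            kinetic = kinetic + v.pow(2).sum(dim=1).mean()
            x = x + v                      # particle update: x_{k+1} = x_k + v
        return self.head(x), kinetic

# One training step: task loss plus a weighted transport-cost penalty.
model = KineticResNet(dim=32, depth=8, n_classes=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(64, 32), torch.randint(0, 10, (64,))

opt.zero_grad()
logits, kinetic = model(x)
lam = 1e-2  # assumed penalty weight, not taken from the paper
loss = nn.functional.cross_entropy(logits, y) + lam * kinetic
loss.backward()
opt.step()
```

Penalizing the kinetic energy biases training toward transport maps that move the data as little as possible while still solving the task, which is the sense in which the trained network follows a principle of least action.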

Related articles:
arXiv:2304.03096 [stat.ML] (Published 2023-04-06)
Spectral Gap Regularization of Neural Networks
arXiv:2106.13682 [stat.ML] (Published 2021-06-25)
Prediction of Hereditary Cancers Using Neural Networks
arXiv:1503.02531 [stat.ML] (Published 2015-03-09)
Distilling the Knowledge in a Neural Network