arXiv:1704.04932 [cs.LG]

Deep Relaxation: partial differential equations for optimizing deep neural networks

Pratik Chaudhari, Adam Oberman, Stanley Osher, Stefano Soatto, Guillaume Carlier

Published 2017-04-17 (Version 1)

We establish connections between non-convex optimization methods for training deep neural networks (DNNs) and the theory of partial differential equations (PDEs). In particular, we focus on relaxation techniques initially developed in statistical physics and show that the resulting relaxed loss functions are solutions of a nonlinear Hamilton-Jacobi-Bellman equation. We use the underlying stochastic control problem to analyze the geometry of the relaxed energy landscape and the convergence properties of the resulting algorithms, confirming empirical evidence. This paper opens non-convex optimization problems arising in deep learning to ideas from the PDE literature. In particular, we show that the non-viscous Hamilton-Jacobi equation leads to an elegant algorithm, based on the Hopf-Lax formula, that outperforms state-of-the-art methods. Furthermore, we show that these algorithms scale well in practice and can effectively tackle the high dimensionality of modern neural networks.
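
For orientation, here is a minimal LaTeX sketch of the equations the abstract alludes to, written with assumed symbols (loss f, relaxed loss u, artificial time t, inverse temperature \beta) rather than the paper's own notation. The non-viscous Hamilton-Jacobi equation and its classical Hopf-Lax solution are standard results; the viscous variant is the one usually associated with local-entropy-style relaxations.

\begin{align*}
  % Non-viscous Hamilton-Jacobi equation for the relaxed loss u(x, t)
  &\partial_t u + \tfrac{1}{2}\,\lVert \nabla_x u \rVert^2 = 0,
    \qquad u(x, 0) = f(x), \\[4pt]
  % Hopf-Lax formula: the solution is the Moreau envelope of the original loss f
  &u(x, t) = \inf_{y} \Big[ f(y) + \tfrac{1}{2t}\,\lVert x - y \rVert^2 \Big], \\[4pt]
  % Viscous Hamilton-Jacobi equation: adds a Laplacian term weighted by beta^{-1}
  &\partial_t u + \tfrac{1}{2}\,\lVert \nabla_x u \rVert^2 = \tfrac{\beta^{-1}}{2}\,\Delta u .
\end{align*}

Under suitable regularity, if the infimum in the Hopf-Lax formula is attained at y*, then \nabla_x u(x, t) = (x - y*)/t, so a gradient step on the relaxed loss u amounts to a proximal-point-style update on the original loss f; this is one way such a formula can be turned into a training algorithm, though the paper's specific scheme may differ in detail.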

Related articles:
arXiv:1804.04272 [cs.LG] (Published 2018-04-12)
Deep Neural Networks motivated by Partial Differential Equations
arXiv:2301.10737 [cs.LG] (Published 2023-01-25)
Distributed Control of Partial Differential Equations Using Convolutional Reinforcement Learning
arXiv:2303.17078 [cs.LG] (Published 2023-03-30)
Machine Learning for Partial Differential Equations