arXiv Analytics


arXiv:2310.00327 [stat.ML]

Memorization with neural nets: going beyond the worst case

Sjoerd Dirksen, Patrick Finke, Martin Genzel

Published 2023-09-30 (Version 1)

In practice, deep neural networks are often able to easily interpolate their training data. To understand this phenomenon, many works have aimed to quantify the memorization capacity of a neural network architecture: the largest number of points such that the architecture can interpolate any placement of these points with any assignment of labels. For real-world data, however, one intuitively expects the presence of a benign structure so that interpolation already occurs at a smaller network size than suggested by memorization capacity. In this paper, we investigate interpolation by adopting an instance-specific viewpoint. We introduce a simple randomized algorithm that, given a fixed finite dataset with two classes, with high probability constructs an interpolating three-layer neural network in polynomial time. The required number of parameters is linked to geometric properties of the two classes and their mutual arrangement. As a result, we obtain guarantees that are independent of the number of samples and hence move beyond worst-case memorization capacity bounds. We illustrate the effectiveness of the algorithm in non-pathological situations with extensive numerical experiments and link the insights back to the theoretical results.
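To make the notion of "interpolating a two-class dataset" concrete, here is a minimal, hypothetical sketch. It is not the authors' construction: the two-hidden-layer random-feature network, the ReLU activation, the widths, and the least-squares fit of the output layer are all illustrative assumptions. The point is only to show what it means for a three-layer network to achieve zero training error on a finite labeled dataset.

```python
# Hypothetical sketch of interpolation with a small three-layer ReLU network:
# random hidden weights, output layer fit by least squares. Not the paper's algorithm.
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class dataset: n points in d dimensions with labels in {-1, +1}.
n, d = 40, 5
X = rng.standard_normal((n, d))
y = np.sign(rng.standard_normal(n))

def relu(z):
    return np.maximum(z, 0.0)

# Two random ReLU feature layers followed by a linear readout (widths are arbitrary).
m1, m2 = 64, 64
W1 = rng.standard_normal((d, m1)) / np.sqrt(d)
W2 = rng.standard_normal((m1, m2)) / np.sqrt(m1)

H = relu(relu(X @ W1) @ W2)                    # features after the two hidden layers
w_out, *_ = np.linalg.lstsq(H, y, rcond=None)  # fit the output weights to the labels

pred = np.sign(H @ w_out)
print("training error:", np.mean(pred != y))   # 0.0 means the network interpolates
```

On such a small, generic dataset this scheme typically reaches zero training error, but it says nothing about how many parameters are needed; the paper's contribution is to tie that parameter count to the geometry of the two classes rather than to worst-case memorization capacity.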

Related articles:
arXiv:1805.10965 [stat.ML] (Published 2018-05-28)
Lipschitz regularity of deep neural networks: analysis and efficient estimation
arXiv:1402.1869 [stat.ML] (Published 2014-02-08, updated 2014-06-07)
On the Number of Linear Regions of Deep Neural Networks
arXiv:1712.09482 [stat.ML] (Published 2017-12-27)
Robust Loss Functions under Label Noise for Deep Neural Networks