arXiv:2210.07931 [stat.ML]

Sequential Learning of Neural Networks for Prequential MDL

Jörg Bornschein, Yazhe Li, Marcus Hutter

Published 2022-10-14 (Version 1)

Minimum Description Length (MDL) provides a framework and an objective for principled model evaluation. It formalizes Occam's Razor and can be applied to data from non-stationary sources. In the prequential formulation of MDL, the objective is to minimize the cumulative next-step log-loss when sequentially going through the data and using previous observations for parameter estimation. It thus closely resembles a continual- or online-learning problem. In this study, we evaluate approaches for computing prequential description lengths for image classification datasets with neural networks. Taking computational cost into account, we find that online learning with rehearsal performs favorably compared to the previously widely used block-wise estimation. We propose forward-calibration to better align the model's predictions with the empirical observations, and we introduce replay-streams, a minibatch incremental training technique that efficiently implements approximate random replay while avoiding large in-memory replay buffers. As a result, we present description lengths for a suite of image classification datasets that improve upon previously reported results by large margins.
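For reference, the prequential description length minimized here is the cumulative next-step log-loss; in the supervised setting of the paper it takes the following standard form (the notation below is a conventional reconstruction, not quoted from the paper):

    % Prequential description length of labels y_1,...,y_T given inputs x_1,...,x_T:
    % each label is coded under parameters \hat{\theta} estimated only from
    % the observations preceding step t.
    L_{\mathrm{preq}}(y_{1:T} \mid x_{1:T})
        = \sum_{t=1}^{T} -\log p\bigl( y_t \mid x_t,\, \hat{\theta}(x_{<t}, y_{<t}) \bigr)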
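The evaluation loop the abstract describes, online learning with rehearsal, can be sketched as follows. The interface (predict_log_proba, partial_fit) and the capped in-memory buffer are illustrative assumptions for this sketch; the paper's replay-streams technique is specifically designed to avoid holding such a buffer in memory.

    import numpy as np

    def prequential_log_loss(model, stream, rng, replay=32, capacity=10_000):
        # Sketch of prequential evaluation with rehearsal. Assumed interface:
        # model.predict_log_proba(x) -> (batch, classes) log-probabilities,
        # model.partial_fit(x, y) performs one incremental training step;
        # stream yields (x, y) minibatches in their natural order.
        buf_x, buf_y = [], []   # capped in-memory buffer (a simplification)
        total_nats = 0.0
        for x, y in stream:
            # 1. Score the batch *before* training on it; the accumulated
            #    next-step log-loss is the prequential description length.
            lp = model.predict_log_proba(x)
            total_nats -= lp[np.arange(len(y)), y].sum()
            # 2. Online update with rehearsal: mix replayed past examples
            #    into the training batch to mitigate forgetting.
            train_x, train_y = x, y
            if buf_x:
                idx = rng.integers(len(buf_x), size=min(replay, len(buf_x)))
                train_x = np.concatenate([x, np.stack([buf_x[i] for i in idx])])
                train_y = np.concatenate([y, np.asarray([buf_y[i] for i in idx])])
            model.partial_fit(train_x, train_y)
            # 3. Remember the new examples, capping the buffer to bound memory.
            for xi, yi in zip(x, y):
                if len(buf_x) < capacity:
                    buf_x.append(xi)
                    buf_y.append(yi)
        return total_nats

Block-wise estimation, by contrast, retrains a model from scratch on each data prefix and scores only the following block, which is what makes it expensive relative to an incremental loop like the one above.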

Related articles:
arXiv:2006.08437 [stat.ML] (Published 2020-06-15)
Depth Uncertainty in Neural Networks
arXiv:2007.12826 [stat.ML] (Published 2020-07-25)
The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training
arXiv:1705.05598 [stat.ML] (Published 2017-05-16)
PatternNet and PatternLRP -- Improving the interpretability of neural networks