arXiv:2210.07931 [stat.ML]

Sequential Learning of Neural Networks for Prequential MDL

Jörg Bornschein, Yazhe Li, Marcus Hutter

Published 2022-10-14 (Version 1)

Minimum Description Length (MDL) provides a framework and an objective for principled model evaluation. It formalizes Occam's Razor and can be applied to data from non-stationary sources. In the prequential formulation of MDL, the objective is to minimize the cumulative next-step log-loss when sequentially going through the data and using previous observations for parameter estimation. It thus closely resembles a continual- or online-learning problem. In this study, we evaluate approaches for computing prequential description lengths for image classification datasets with neural networks. Taking computational cost into account, we find that online learning with rehearsal performs favorably compared to the previously widely used block-wise estimation. We propose forward-calibration to better align the model's predictions with the empirical observations, and we introduce replay-streams, a minibatch incremental training technique that efficiently implements approximate random replay while avoiding large in-memory replay buffers. As a result, we present description lengths for a suite of image classification datasets that improve upon previously reported results by large margins.
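For reference, the prequential description length minimized here is the cumulative next-step log-loss; in the supervised setting of the paper it takes the following standard form (the notation below is a conventional reconstruction, not quoted from the paper):

    % Prequential description length of labels y_1,...,y_T given inputs x_1,...,x_T:
    % each label is coded under parameters \hat{\theta} estimated only from
    % the observations preceding step t.
    L_{\mathrm{preq}}(y_{1:T} \mid x_{1:T})
        = \sum_{t=1}^{T} -\log p\bigl( y_t \mid x_t,\, \hat{\theta}(x_{<t}, y_{<t}) \bigr)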
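The evaluation loop the abstract describes, online learning with rehearsal, can be sketched as follows. The interface (predict_log_proba, partial_fit) and the capped in-memory buffer are illustrative assumptions for this sketch; the paper's replay-streams technique is specifically designed to avoid holding such a buffer in memory.

    import numpy as np

    def prequential_log_loss(model, stream, rng, replay=32, capacity=10_000):
        # Sketch of prequential evaluation with rehearsal. Assumed interface:
        # model.predict_log_proba(x) -> (batch, classes) log-probabilities,
        # model.partial_fit(x, y) performs one incremental training step;
        # stream yields (x, y) minibatches in their natural order.
        buf_x, buf_y = [], []   # capped in-memory buffer (a simplification)
        total_nats = 0.0
        for x, y in stream:
            # 1. Score the batch *before* training on it; the accumulated
            #    next-step log-loss is the prequential description length.
            lp = model.predict_log_proba(x)
            total_nats -= lp[np.arange(len(y)), y].sum()
            # 2. Online update with rehearsal: mix replayed past examples
            #    into the training batch to mitigate forgetting.
            train_x, train_y = x, y
            if buf_x:
                idx = rng.integers(len(buf_x), size=min(replay, len(buf_x)))
                train_x = np.concatenate([x, np.stack([buf_x[i] for i in idx])])
                train_y = np.concatenate([y, np.asarray([buf_y[i] for i in idx])])
            model.partial_fit(train_x, train_y)
            # 3. Remember the new examples, capping the buffer to bound memory.
            for xi, yi in zip(x, y):
                if len(buf_x) < capacity:
                    buf_x.append(xi)
                    buf_y.append(yi)
        return total_nats

Block-wise estimation, by contrast, retrains a model from scratch on each data prefix and scores only the following block, which is what makes it expensive relative to an incremental loop like the one above.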

Related articles:
arXiv:2006.08437 [stat.ML] (Published 2020-06-15)
Depth Uncertainty in Neural Networks
arXiv:2007.12826 [stat.ML] (Published 2020-07-25)
The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training
arXiv:1705.05598 [stat.ML] (Published 2017-05-16)
PatternNet and PatternLRP -- Improving the interpretability of neural networks