arXiv:2312.00427 [stat.ML]

From Mutual Information to Expected Dynamics: New Generalization Bounds for Heavy-Tailed SGD

Benjamin Dupuis, Paul Viallard

Published 2023-12-01, Version 1

Understanding the generalization abilities of modern machine learning algorithms has been a major research topic over the past decades. In recent years, the learning dynamics of Stochastic Gradient Descent (SGD) have been related to heavy-tailed dynamics. This has been successfully applied to generalization theory by exploiting the fractal properties of those dynamics. However, the derived bounds depend on mutual information (decoupling) terms that are beyond the reach of computability. In this work, we prove generalization bounds over the trajectory of a class of heavy-tailed dynamics, without those mutual information terms. Instead, we introduce a geometric decoupling term by comparing the learning dynamics (depending on the empirical risk) with an expected one (depending on the population risk). We further upper-bound this geometric term, by using techniques from the heavy-tailed and the fractal literature, making it fully computable. Moreover, as an attempt to tighten the bounds, we propose a PAC-Bayesian setting based on perturbed dynamics, in which the same geometric term plays a crucial role and can still be bounded using the techniques described above.

Comments: Accepted in the NeurIPS 2023 Workshop Heavy Tails in Machine Learning
Categories: stat.ML, cs.LG
Related articles:
arXiv:2111.09831 [stat.ML] (Published 2021-11-18, updated 2022-09-08)
Causal Forecasting: Generalization Bounds for Autoregressive Models
arXiv:2405.09516 [stat.ML] (Published 2024-05-15)
Generalization Bounds for Causal Regression: Insights, Guarantees and Sensitivity Analysis
arXiv:1902.00985 [stat.ML] (Published 2019-02-03)
Adversarial Networks and Autoencoders: The Primal-Dual Relationship and Generalization Bounds