arXiv:2209.08951 [stat.ML]

Generalization Bounds for Stochastic Gradient Descent via Localized $\varepsilon$-Covers

Sejun Park, Umut Şimşekli, Murat A. Erdogdu

Published 2022-09-19 (Version 1)

In this paper, we propose a new covering technique localized for the trajectories of SGD. This localization provides an algorithm-specific complexity, measured by the covering number, which can have dimension-independent cardinality, in contrast to standard uniform covering arguments that result in exponential dimension dependency. Based on this localized construction, we show that if the objective function is a finite perturbation of a piecewise strongly convex and smooth function with $P$ pieces, i.e., non-convex and non-smooth in general, the generalization error can be upper bounded by $O(\sqrt{(\log n\log(nP))/n})$, where $n$ is the number of data samples. In particular, this rate is independent of dimension and does not require early stopping or a decaying step size. Finally, we employ these results in various contexts and derive generalization bounds for multi-index linear models, multi-class support vector machines, and $K$-means clustering for both hard and soft label setups, improving the known state-of-the-art rates.
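
As a hedged restatement of the claimed rate (the notation $R$, $\hat R_n$, and $w_{\mathrm{SGD}}$ is assumed here for illustration; the exact conditions and constants are in the paper): writing $R$ for the population risk, $\hat R_n$ for the empirical risk over the $n$ samples, and $w_{\mathrm{SGD}}$ for the SGD output, the bound takes the form

$$\mathbb{E}\big[\, R(w_{\mathrm{SGD}}) - \hat R_n(w_{\mathrm{SGD}}) \,\big] \;\le\; O\!\left(\sqrt{\frac{\log n\,\log(nP)}{n}}\right),$$

so the rate degrades only logarithmically in the number of pieces $P$ and does not depend on the ambient dimension.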
