arXiv:1807.06343 [stat.ML]

Learning with SGD and Random Features

Luigi Carratino, Alessandro Rudi, Lorenzo Rosasco

Published 2018-07-17, Version 1

Sketching and stochastic gradient methods are arguably the most common techniques to derive efficient large-scale learning algorithms. In this paper, we investigate their application in the context of nonparametric statistical learning. More precisely, we study the estimator defined by stochastic gradients with mini-batches and random features. The latter can be seen as a form of nonlinear sketching and can be used to define approximate kernel methods. The estimator we consider is not explicitly penalized/constrained, and regularization is implicit. Indeed, our study highlights how different parameters, such as the number of features, the number of iterations, the step-size and the mini-batch size, control the learning properties of the solutions. We do this by deriving optimal finite-sample bounds under standard assumptions. The obtained results are corroborated and illustrated by numerical experiments.
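
To make the setting concrete, the sketch below shows the kind of estimator the abstract describes: unregularized mini-batch SGD on the squared loss over random Fourier features (one common instance of random features, here approximating a Gaussian kernel). This is an illustrative toy implementation, not the authors' code; the function names, the choice of squared loss, and all default parameter values are assumptions made for the example.

```python
import numpy as np

def random_fourier_features(X, W, b):
    # Random Fourier feature map approximating a Gaussian kernel:
    # phi(x) = sqrt(2/M) * cos(W x + b), with rows of W ~ N(0, 1/sigma^2 I)
    # and b ~ Unif[0, 2*pi].
    M = W.shape[0]
    return np.sqrt(2.0 / M) * np.cos(X @ W.T + b)

def sgd_random_features(X, y, n_features=200, sigma=1.0, step_size=0.5,
                        batch_size=10, n_epochs=5, seed=0):
    """Mini-batch SGD on the (unpenalized) squared loss over random features.

    There is no explicit regularizer: regularization is implicit and is
    controlled by the number of features, the number of iterations/passes,
    the step-size and the mini-batch size.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=1.0 / sigma, size=(n_features, d))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    w = np.zeros(n_features)                      # estimator in feature space
    for _ in range(n_epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            Phi = random_fourier_features(X[idx], W, b)
            # Gradient of the average squared loss over the mini-batch.
            grad = Phi.T @ (Phi @ w - y[idx]) / len(idx)
            w -= step_size * grad
    predict = lambda X_new: random_fourier_features(X_new, W, b) @ w
    return w, predict

# Toy usage: regression on a noisy sine curve.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(-3, 3, size=(500, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
    _, predict = sgd_random_features(X, y)
    print("train MSE:", np.mean((predict(X) - y) ** 2))
```

In this sketch, early stopping (few epochs), a moderate step-size, larger mini-batches and a limited number of random features each play the role that an explicit penalty would otherwise play.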

Related articles:
arXiv:1602.04474 [stat.ML] (Published 2016-02-14)
Generalization Properties of Learning with Random Features
arXiv:2004.11154 [stat.ML] (Published 2020-04-23)
Random Features for Kernel Approximation: A Survey in Algorithms, Theory, and Beyond
arXiv:2007.00360 [stat.ML] (Published 2020-07-01)
Decentralised Learning with Random Features and Distributed Gradient Descent