arXiv:1602.02823 [cs.LG]
Poor starting points in machine learning
Published 2016-02-09 (Version 1)
Poor (even random) starting points for learning/training/optimization are common in machine learning. In many settings, the method of Robbins and Monro (online stochastic gradient descent) is known to be optimal for good starting points, but may not be optimal for poor ones: Nesterov acceleration can help during the initial iterations, even though Nesterov methods not designed for stochastic approximation can hurt during later iterations. The common practice of training with nontrivial minibatches enhances the advantage of Nesterov acceleration.
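The contrast the abstract draws can be illustrated with a minimal sketch (not the paper's experiments): plain Robbins-Monro SGD versus Nesterov-accelerated SGD on a toy 1-D quadratic with a noisy gradient, started far from the optimum. All function names, step sizes, and the noise model below are illustrative assumptions.

```python
import random

def noisy_grad(x, rng, sigma=0.1):
    """Stochastic gradient of f(x) = 0.5 * x**2 (true gradient x) plus zero-mean noise."""
    return x + rng.gauss(0.0, sigma)

def sgd(x0, steps, lr=0.05, seed=0):
    """Plain online SGD (Robbins-Monro style, here with a fixed step size)."""
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        x -= lr * noisy_grad(x, rng)
    return x

def nesterov_sgd(x0, steps, lr=0.05, momentum=0.9, seed=0):
    """SGD with Nesterov acceleration: gradient evaluated at a look-ahead point."""
    rng = random.Random(seed)
    x, v = x0, 0.0
    for _ in range(steps):
        g = noisy_grad(x + momentum * v, rng)  # look-ahead gradient
        v = momentum * v - lr * g
        x += v
    return x

if __name__ == "__main__":
    # From the deliberately poor starting point x0 = 100, the accelerated
    # iterate is typically much closer to the optimum (0) after the same
    # modest number of early iterations.
    print(abs(sgd(100.0, 50)), abs(nesterov_sgd(100.0, 50)))
```

This only mirrors the abstract's first claim (acceleration helps early from a poor start); the later-iteration behavior and the minibatch effect discussed in the paper are not modeled here.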
Comments: 11 pages, 3 figures, 1 table; this initial version is literally identical to that circulated among a restricted audience over a month ago
Related articles:
arXiv:1510.02533 [cs.LG] (Published 2015-10-09)
New Optimisation Methods for Machine Learning
arXiv:1506.00976 [cs.LG] (Published 2015-06-02)
Toward a generic representation of random variables for machine learning
arXiv:1808.00931 [cs.LG] (Published 2018-08-02)
Machine Learning of Space-Fractional Differential Equations