arXiv:2008.03501 [cs.LG]
Why to "grow" and "harvest" deep learning models?
Ilona Kulikovskikh, Tarzan Legović
Published 2020-08-08Version 1
Current expectations from training deep learning models with gradient-based methods include: 1) transparency; 2) high convergence rates; 3) high inductive biases. While the state-of-the-art methods with adaptive learning rate schedules are fast, they still fail to meet the other two requirements. We suggest reconsidering neural network models in terms of single-species population dynamics, where adaptation comes naturally from the open-ended processes of "growth" and "harvesting". We show that stochastic gradient descent (SGD) with two balanced pre-defined values of per capita growth and harvesting rates outperforms the most common adaptive gradient methods on all three requirements.
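The abstract does not spell out the update rule, but single-species population dynamics with harvesting is classically the logistic equation dN/dt = rN(1 - N/K) - hN. A speculative sketch of how "growth" and "harvesting" rates could drive a learning-rate schedule for SGD is shown below; the variable names `r`, `h`, and `lr_max`, and the choice to apply the dynamics to the learning rate, are illustrative assumptions, not the paper's actual method or notation.

```python
import numpy as np

def grow_harvest_lr(lr, r, h, lr_max):
    """One Euler step of logistic growth with proportional harvesting
    (illustrative assumption, not the paper's update rule):
        d(lr)/dt = r * lr * (1 - lr / lr_max) - h * lr
    With r > h, lr settles at the equilibrium lr_max * (1 - h / r)."""
    return lr + r * lr * (1.0 - lr / lr_max) - h * lr

# Toy quadratic objective f(w) = 0.5 * ||w||^2, so grad f(w) = w.
w = np.array([2.0, -3.0])
lr, r, h, lr_max = 0.05, 0.2, 0.1, 0.5  # balanced pre-defined rates (hypothetical values)

for step in range(200):
    grad = w
    w = w - lr * grad                        # plain SGD step
    lr = grow_harvest_lr(lr, r, h, lr_max)   # "grow"/"harvest" the learning rate

print(np.linalg.norm(w), lr)
```

With these hypothetical values the learning rate rises from 0.05 toward the equilibrium 0.25 and stays bounded by `lr_max`, which hints at how balanced growth and harvesting could yield an adaptive yet transparent schedule.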