arXiv:2108.00051 [cs.LG]

Coordinate descent on the orthogonal group for recurrent neural network training

Estelle Massart, Vinayak Abrol

Published 2021-07-30 (Version 1)

We propose to use stochastic Riemannian coordinate descent on the orthogonal group for recurrent neural network training. At each iteration, the algorithm rotates two columns of the recurrent matrix, an operation that can be implemented efficiently as a multiplication by a Givens matrix. When the coordinate is selected uniformly at random at each iteration, we prove convergence of the proposed algorithm under standard assumptions on the loss function, stepsize, and minibatch noise. In addition, we numerically demonstrate that the Riemannian gradient in recurrent neural network training has an approximately sparse structure. Leveraging this observation, we propose a faster variant of the algorithm that relies on the Gauss-Southwell rule. Experiments on a benchmark recurrent neural network training problem demonstrate the effectiveness of the proposed algorithm.
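To make the update concrete, below is a minimal NumPy sketch of one such coordinate-descent step, based only on the description in the abstract: the (i, j) coordinate of the Riemannian gradient is taken as the corresponding entry of the skew-symmetric part of W^T G, and the step is applied by right-multiplying W with a Givens rotation in the (i, j) plane. The function names, the angle update theta = -stepsize * g_ij, and the greedy pair-selection rule are assumptions for illustration, not the paper's exact formulation.

import numpy as np

def givens_coordinate_step(W, grad, i, j, stepsize):
    """One sketched Riemannian coordinate-descent step on the orthogonal group.

    W        : (n, n) orthogonal recurrent matrix (updated in place)
    grad     : (n, n) Euclidean gradient of the loss with respect to W
    (i, j)   : indices of the two columns to rotate
    stepsize : learning rate (assumed plain gradient step on this coordinate)
    """
    # (i, j) coordinate of the Riemannian gradient: the (i, j) entry of the
    # skew-symmetric part of W^T grad.
    g_ij = W[:, i] @ grad[:, j] - W[:, j] @ grad[:, i]
    theta = -stepsize * g_ij

    # Right-multiplication by a Givens matrix only mixes columns i and j,
    # so the update costs O(n) and keeps W exactly orthogonal.
    c, s = np.cos(theta), np.sin(theta)
    col_i, col_j = W[:, i].copy(), W[:, j].copy()
    W[:, i] = c * col_i + s * col_j
    W[:, j] = -s * col_i + c * col_j
    return W

def gauss_southwell_pair(W, grad):
    """Greedy (Gauss-Southwell-style) coordinate selection: pick the column
    pair with the largest Riemannian-gradient coordinate in magnitude.
    The paper's faster variant exploits the approximate sparsity of this
    coordinate matrix; this brute-force argmax is only a sketch."""
    S = W.T @ grad
    S = S - S.T                       # skew-symmetric coordinate matrix
    return np.unravel_index(np.argmax(np.abs(np.triu(S, 1))), S.shape)

In a training loop, one would compute the minibatch gradient with respect to the recurrent matrix, pick (i, j) either uniformly at random or via the greedy rule above, and apply givens_coordinate_step; the recurrent matrix stays on the orthogonal group by construction, without any retraction or re-orthogonalization step.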

Related articles:
arXiv:2003.13563 [cs.LG] (Published 2020-03-30)
Stochastic Flows and Geometric Optimization on the Orthogonal Group
arXiv:1906.02435 [cs.LG] (Published 2019-06-06)
Complete Dictionary Learning via $\ell^4$-Norm Maximization over the Orthogonal Group
arXiv:cs/0001004 [cs.LG] (Published 2000-01-07)
Multiplicative Algorithm for Orthogonal Groups and Independent Component Analysis