
arXiv:2109.14142 [cs.LG]

On the Provable Generalization of Recurrent Neural Networks

Published 2021-09-29, updated 2022-01-26 (Version 4)

Recurrent neural networks (RNNs) are a fundamental structure in deep learning. Recent works study the training process of over-parameterized neural networks and show that such networks can learn functions in some notable concept classes with a provable generalization error bound. In this paper, we analyze the training and generalization of RNNs with random initialization, and provide the following improvements over recent works: 1) For an RNN with input sequence $x=(X_1,X_2,\ldots,X_L)$, previous works study learning functions that are a summation of $f(\beta^T_lX_l)$ and require the normalization condition $\|X_l\|\leq\epsilon$ for some very small $\epsilon$ depending on the complexity of $f$. Using a detailed analysis of the neural tangent kernel matrix, we prove a generalization error bound for learning such functions without the normalization condition, and show that some notable concept classes are learnable with the number of iterations and samples scaling almost-polynomially in the input length $L$. 2) Moreover, we prove a novel result for learning $N$-variable functions of the input sequence of the form $f(\beta^T[X_{l_1},\ldots,X_{l_N}])$, which do not belong to the "additive" concept class, i.e., the summation of functions $f(X_l)$. We show that when either $N$ or $l_0=\max(l_1,\ldots,l_N)-\min(l_1,\ldots,l_N)$ is small, $f(\beta^T[X_{l_1},\ldots,X_{l_N}])$ is learnable with the number of iterations and samples scaling almost-polynomially in the input length $L$.
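To make the two target classes in the abstract concrete, the following is a minimal sketch of how they could be evaluated on one input sequence. It is an illustration only, not the paper's construction or training procedure. The function names (`additive_target`, `n_variable_target`, `span`), the 0-indexed positions, and the choice of `tanh` as $f$ are all assumptions made for the example.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def additive_target(x, betas, f):
    """Additive class: sum over positions l of f(beta_l^T X_l)."""
    return sum(f(dot(beta_l, x_l)) for beta_l, x_l in zip(betas, x))


def n_variable_target(x, beta, positions, f):
    """N-variable class: f(beta^T [X_{l_1}, ..., X_{l_N}]).

    The selected tokens are concatenated into one vector first, so f
    sees a joint function of several positions (hypothetical 0-indexing).
    """
    concat = [v for l in positions for v in x[l]]
    return f(dot(beta, concat))


def span(positions):
    """l_0 = max(l_1, ..., l_N) - min(l_1, ..., l_N)."""
    return max(positions) - min(positions)


# Illustrative random instance: sequence length L, token dimension d.
random.seed(0)
L, d = 6, 3
x = [[random.gauss(0, 1) for _ in range(d)] for _ in range(L)]
betas = [[random.gauss(0, 1) for _ in range(d)] for _ in range(L)]
positions = [1, 2, 4]  # N = 3 selected positions
beta = [random.gauss(0, 1) for _ in range(d * len(positions))]

y_add = additive_target(x, betas, math.tanh)
y_joint = n_variable_target(x, beta, positions, math.tanh)
l0 = span(positions)
```

With $f(z)=z^2$, tokens $X_1=(1,0)$, $X_2=(0,2)$, and all-ones weights, the additive form gives $1^2+2^2=5$ while the joint form gives $(1+2)^2=9$. The cross term is what places the second class outside the additive one.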

Comments: Accepted to NeurIPS 2021

Categories: cs.LG, stat.ML

Keywords: recurrent neural network, provable generalization, generalization error bound, notable concept classes, learn functions

Related articles:

arXiv:1206.6412 [cs.LG] (Published 2012-06-27)

A Simple Algorithm for Semi-supervised Learning with Improved Generalization Error Bound

Ming Ji, Tianbao Yang, Binbin Lin, Rong Jin, Jiawei Han

arXiv:1904.02931 [cs.LG] (Published 2019-04-05)

Weighted Automata Extraction from Recurrent Neural Networks via Regression on State Spaces

Takamasa Okudono, Masaki Waga, Taro Sekiyama, Ichiro Hasuo

arXiv:1807.01473 [cs.LG] (Published 2018-07-04)

Supervised Reinforcement Learning with Recurrent Neural Network for Dynamic Treatment Recommendation