
arXiv:1902.02234 [cs.LG]

Finite-Sample Analysis for SARSA and Q-Learning with Linear Function Approximation

Shaofeng Zou, Tengyu Xu, Yingbin Liang

Published 2019-02-06 (Version 1)

Although the convergence of major reinforcement learning algorithms has been extensively studied, finite-sample analysis that further characterizes the convergence rate in terms of sample complexity for problems with continuous state spaces remains very limited. Such analysis is especially challenging for algorithms with dynamically changing learning policies and under non-i.i.d. sampled data. In this paper, we present the first finite-sample analysis for the SARSA algorithm and its minimax variant (for zero-sum Markov games), with a single sample path and linear function approximation. To establish our results, we develop a novel technique to bound the gradient bias for dynamically changing learning policies, which may be of independent interest. We further provide finite-sample bounds for Q-learning and its minimax variant. Comparison of our result with the existing finite-sample bound indicates that linear function approximation achieves order-level lower sample complexity than the nearest neighbor approach.
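For orientation, the sketch below shows on-policy SARSA with linear function approximation and an ε-greedy learning policy on a single sample path, the basic setting named in the abstract. It is a minimal illustration, not the exact variant analyzed in the paper: the Gymnasium-style environment interface (`env.reset`/`env.step`), the feature map `featurize`, the constant step size, and the ε-greedy policy improvement are all illustrative assumptions.

```python
import numpy as np

def sarsa_linear(env, featurize, n_actions, dim, n_steps=10_000,
                 alpha=0.05, gamma=0.99, epsilon=0.1, seed=0):
    """Single-trajectory SARSA with linear function approximation.

    Q(s, a) is approximated by theta @ featurize(s, a). `env` is assumed
    to follow the Gymnasium API and `featurize(s, a)` to return a length-`dim`
    feature vector; both are illustrative assumptions, not the paper's setup.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)

    def q(s, a):
        return theta @ featurize(s, a)

    def policy(s):
        # epsilon-greedy policy induced by the current theta; it changes
        # as theta is updated, i.e. a dynamically changing learning policy.
        if rng.random() < epsilon:
            return int(rng.integers(n_actions))
        return int(np.argmax([q(s, a) for a in range(n_actions)]))

    s, _ = env.reset(seed=seed)
    a = policy(s)
    for _ in range(n_steps):
        s_next, r, terminated, truncated, _ = env.step(a)
        if terminated or truncated:
            # terminal transition: bootstrap target is just the reward
            td_error = r - q(s, a)
            s_next, _ = env.reset()
            a_next = policy(s_next)
        else:
            a_next = policy(s_next)
            td_error = r + gamma * q(s_next, a_next) - q(s, a)
        # semi-gradient TD update along the feature vector of (s, a)
        theta += alpha * td_error * featurize(s, a)
        s, a = s_next, a_next
    return theta
```

The key feature this sketch illustrates is that the data are generated by the policy currently being learned, so consecutive samples are neither independent nor drawn from a fixed policy, which is the difficulty the paper's gradient-bias bound addresses.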

Related articles:
arXiv:2005.10175 [cs.LG] (Published 2020-05-20)
Finite-sample Analysis of Greedy-GQ with Linear Function Approximation under Markovian Noise
arXiv:1911.00934 [cs.LG] (Published 2019-11-03)
Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation
arXiv:2105.12540 [cs.LG] (Published 2021-05-26)
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation