arXiv:2403.02476 [cs.LG]

A Simple Finite-Time Analysis of TD Learning with Linear Function Approximation

Aritra Mitra

Published 2024-03-04 (Version 1)

We study the finite-time convergence of TD learning with linear function approximation under Markovian sampling. Existing proofs for this setting either assume a projection step in the algorithm to simplify the analysis, or require a fairly intricate argument to ensure stability of the iterates. We ask: \textit{Is it possible to retain the simplicity of a projection-based analysis without actually performing a projection step in the algorithm?} Our main contribution is to show that this is possible via a novel two-step argument. In the first step, we use induction to prove that under a standard choice of constant step-size $\alpha$, the iterates generated by TD learning remain uniformly bounded in expectation. In the second step, we establish a recursion that mimics the steady-state dynamics of TD learning up to a bounded perturbation of order $O(\alpha^2)$ that captures the effect of Markovian sampling. Combining these two pieces yields an overall approach that considerably simplifies existing proofs. We conjecture that our inductive proof technique will find applications in the analyses of more complex stochastic approximation algorithms, and conclude by providing some examples of such applications.
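To make the object of the analysis concrete, here is a minimal sketch of the algorithm the abstract describes: TD(0) with linear function approximation, run along a single Markovian trajectory with a constant step size $\alpha$ and no projection step. The Markov reward process, feature map, and hyperparameter values below are illustrative assumptions for a toy example, not taken from the paper.

```python
# Minimal sketch: TD(0) with linear features under Markovian sampling.
# The chain P, rewards r, features Phi, and constants are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

n_states, d = 5, 3          # small Markov reward process, feature dimension d
gamma, alpha = 0.9, 0.05    # discount factor, constant step size

# Random ergodic transition matrix and reward vector (toy problem).
P = rng.random((n_states, n_states))
P /= P.sum(axis=1, keepdims=True)
r = rng.random(n_states)

# Fixed feature matrix: row s is the feature vector phi(s).
Phi = rng.standard_normal((n_states, d))

theta = np.zeros(d)         # weights; the value estimate is Phi @ theta
s = 0
for t in range(50_000):
    s_next = rng.choice(n_states, p=P[s])
    # TD(0) update along one Markovian trajectory, with no projection step.
    td_error = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
    theta += alpha * td_error * Phi[s]
    s = s_next

print("learned value estimates:", Phi @ theta)
```

The per-step update $\theta_{t+1} = \theta_t + \alpha \, \delta_t \, \phi(s_t)$, with TD error $\delta_t = r_t + \gamma \phi(s_{t+1})^\top \theta_t - \phi(s_t)^\top \theta_t$, is the unprojected iterate of the kind whose uniform boundedness in expectation the paper's inductive argument establishes.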
