arXiv:2210.05918 [cs.LG]

Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation

Gandharv Patil, Prashanth L. A., Dheeraj Nagaraj, Doina Precup

Published 2022-10-12 | Version 1

We study the finite-time behaviour of the popular temporal difference (TD) learning algorithm when combined with tail-averaging. We derive finite-time bounds on the parameter error of the tail-averaged TD iterate under a step-size choice that does not require information about the eigenvalues of the matrix underlying the projected TD fixed point. Our analysis shows that tail-averaged TD converges at the optimal $O\left(1/t\right)$ rate, both in expectation and with high probability. In addition, our bounds exhibit a sharper rate of decay for the initial error (bias), which is an improvement over averaging all iterates. We also propose and analyse a variant of TD that incorporates regularisation. From our analysis, we conclude that the regularised version of TD is useful for problems with ill-conditioned features.
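To make the setting concrete, here is a minimal sketch of TD(0) with linear function approximation, tail-averaging, and an optional L2 regularisation term. All names, the data format (precomputed feature/reward/next-feature arrays), and the specific form of the regularised update are illustrative assumptions, not the authors' construction; the paper's precise algorithm and step-size conditions are given in the text itself.

```python
import numpy as np

def tail_averaged_td(features, rewards, next_features, gamma=0.9,
                     step_size=0.01, tail_fraction=0.5, reg=0.0):
    """Sketch of tail-averaged TD(0) with linear value approximation.

    features, next_features: (n, d) arrays of state features phi(s_t), phi(s_{t+1})
    rewards: (n,) array of rewards r_t
    tail_fraction: fraction of final iterates to average (the "tail")
    reg: L2 regularisation strength (illustrative form, not the paper's)
    """
    d = features.shape[1]
    theta = np.zeros(d)
    iterates = []
    for phi, r, phi_next in zip(features, rewards, next_features):
        # TD(0) semi-gradient step on the TD error delta_t
        td_error = r + gamma * (phi_next @ theta) - (phi @ theta)
        theta = theta + step_size * (td_error * phi - reg * theta)
        iterates.append(theta.copy())
    # Average only the last `tail_fraction` of iterates, discarding the
    # early transient so the initial error (bias) decays faster than
    # under full-iterate averaging.
    start = int(len(iterates) * (1.0 - tail_fraction))
    return np.mean(iterates[start:], axis=0)
```

The key contrast with Polyak-Ruppert style averaging is the `start` index: averaging begins only after the initial transient, which is what yields the sharper bias decay the abstract refers to.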

Related articles:
arXiv:1806.02450 [cs.LG] (Published 2018-06-06)
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
arXiv:2406.07892 [cs.LG] (Published 2024-06-12)
Finite Time Analysis of Temporal Difference Learning for Mean-Variance in a Discounted MDP
arXiv:2204.09801 [cs.LG] (Published 2022-04-20)
Exact Formulas for Finite-Time Estimation Errors of Decentralized Temporal Difference Learning with Linear Function Approximation