arXiv Analytics

arXiv:2204.09801 [cs.LG]

Exact Formulas for Finite-Time Estimation Errors of Decentralized Temporal Difference Learning with Linear Function Approximation

Xingang Guo, Bin Hu

Published 2022-04-20, Version 1

In this paper, we consider the policy evaluation problem in multi-agent reinforcement learning (MARL) and derive exact closed-form formulas for the finite-time mean-squared estimation errors of decentralized temporal difference (TD) learning with linear function approximation. Our analysis hinges on the fact that the decentralized TD learning method can be viewed as a Markov jump linear system (MJLS), so standard MJLS theory can be applied to quantify the mean and covariance matrix of the estimation error of the decentralized TD method at every time step. Various implications of our exact formulas for algorithm performance are also discussed. An interesting finding is that, under a necessary and sufficient stability condition, the mean-squared TD estimation error converges to an exact limit at a specific exponential rate.
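The MJLS viewpoint admits a short numerical illustration. The sketch below is not the paper's construction: the function name mjls_moments and the generic setup are assumptions made here for illustration. It propagates the mode-conditioned first and second moments of a Markov jump linear system x_{k+1} = A[w_k] x_k + b[w_k], where w_k is a finite Markov chain; the abstract's exact formulas would correspond to specializing such moment recursions to the decentralized TD iteration.

```python
import numpy as np

# Minimal sketch (assumed toy setup, not the paper's exact formulas): propagate the
# mode-conditioned mean and second moment of a Markov jump linear system
#   x_{k+1} = A[w_k] @ x_k + b[w_k],   w_k a Markov chain with transition matrix P.
# From E[x_k] and E[x_k x_k^T] one can read off the mean and covariance of the iterate
# at every time step, which is the kind of quantity the abstract refers to.

def mjls_moments(A, b, P, pi0, x0, num_steps):
    """A: list of (d,d) arrays, b: list of (d,) arrays, one per mode.
    P: (m,m) transition matrix with P[i, j] = Prob(w_{k+1}=j | w_k=i).
    pi0: (m,) initial mode distribution, x0: (d,) deterministic initial iterate."""
    m, d = len(A), x0.shape[0]
    pi = pi0.copy()
    q = [pi0[i] * x0 for i in range(m)]                 # q_k^i = E[x_k 1{w_k=i}]
    Q = [pi0[i] * np.outer(x0, x0) for i in range(m)]   # Q_k^i = E[x_k x_k^T 1{w_k=i}]
    means, second_moments = [], []
    for _ in range(num_steps):
        q_next = [np.zeros(d) for _ in range(m)]
        Q_next = [np.zeros((d, d)) for _ in range(m)]
        for i in range(m):
            Aq = A[i] @ q[i]                            # A_i E[x_k 1{w_k=i}]
            AQ = A[i] @ Q[i] @ A[i].T                   # A_i Q_k^i A_i^T
            cross = np.outer(Aq, b[i])                  # A_i q_k^i b_i^T
            for j in range(m):
                q_next[j] += P[i, j] * (Aq + pi[i] * b[i])
                Q_next[j] += P[i, j] * (AQ + cross + cross.T
                                        + pi[i] * np.outer(b[i], b[i]))
        q, Q, pi = q_next, Q_next, P.T @ pi
        means.append(sum(q))                            # E[x_k]
        second_moments.append(sum(Q))                   # E[x_k x_k^T]
    return means, second_moments
```

For example, with the iterate defined as the error between the TD parameter and its fixed point, the mean-squared estimation error at step k is the trace of the returned second moment, and its covariance is second_moments[k] minus the outer product of means[k] with itself.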

Related articles:
arXiv:2403.02476 [cs.LG] (Published 2024-03-04)
A Simple Finite-Time Analysis of TD Learning with Linear Function Approximation
arXiv:2210.07338 [cs.LG] (Published 2022-10-13)
Reinforcement Learning with Unbiased Policy Evaluation and Linear Function Approximation
arXiv:1911.00934 [cs.LG] (Published 2019-11-03)
Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation