arXiv:2210.07338 [cs.LG]
Reinforcement Learning with Unbiased Policy Evaluation and Linear Function Approximation
Published 2022-10-13 (Version 1)
We provide performance guarantees for a variant of simulation-based policy iteration for controlling Markov decision processes. The variant combines stochastic approximation algorithms with state-of-the-art techniques that are useful for very large MDPs, including lookahead, function approximation, and gradient descent. Specifically, we analyze two algorithms. In the first, a new set of weights associated with the feature vectors is obtained via least-squares minimization at each iteration. The second is a two-time-scale stochastic approximation algorithm that takes several gradient-descent steps toward the least-squares solution before obtaining the next iterate via stochastic approximation.
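To make the distinction between the two weight-update schemes concrete, here is a minimal sketch in Python. It is not the paper's exact algorithm: the array names (`phi`, `returns`), the regularization term, the step size, and the number of gradient steps are all illustrative assumptions, and the simulated returns that would come from lookahead/rollouts are replaced by random placeholders.

```python
# Sketch of the two weight-update variants for linear value-function
# approximation described in the abstract (illustrative only).
import numpy as np

def least_squares_weights(features, sampled_returns, reg=1e-6):
    """Variant 1: obtain the new feature weights by solving the
    (regularized) least-squares problem exactly at each iteration."""
    # features: (num_samples, num_features), sampled_returns: (num_samples,)
    A = features.T @ features + reg * np.eye(features.shape[1])
    b = features.T @ sampled_returns
    return np.linalg.solve(A, b)

def gradient_step_weights(w, features, sampled_returns, lr=0.01, num_steps=10):
    """Variant 2 (sketch): take several gradient-descent steps toward the
    least-squares solution, as in a two-time-scale scheme, instead of
    solving the least-squares problem exactly."""
    for _ in range(num_steps):
        residual = features @ w - sampled_returns
        grad = features.T @ residual / features.shape[0]
        w = w - lr * grad
    return w

# Illustrative usage with random data standing in for simulated returns.
rng = np.random.default_rng(0)
phi = rng.normal(size=(100, 5))    # feature matrix
returns = rng.normal(size=100)     # placeholder return estimates
w_ls = least_squares_weights(phi, returns)
w_gd = gradient_step_weights(np.zeros(5), phi, returns)
```

In the actual algorithms, these weight updates would sit inside a policy-iteration loop, with the return estimates produced by simulation and lookahead; the sketch only contrasts the exact least-squares solve with the inexact gradient-descent update.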