arXiv:2406.03734 [math.OC]
Policy Gradient Methods for the Cost-Constrained LQR: Strong Duality and Global Convergence
Published 2024-06-06, Version 1
In safety-critical applications, reinforcement learning (RL) needs to account for safety constraints. However, theoretical understanding of constrained RL for continuous control is largely absent. As a case study, this paper presents a cost-constrained LQR formulation, in which a number of LQR costs with user-defined penalty matrices are subject to constraints. To solve it, we propose a policy gradient (PG) primal-dual method that finds an optimal state feedback gain. Despite the non-convexity of the cost-constrained LQR problem, we provide a constructive proof of strong duality and a geometric interpretation of an optimal multiplier set. By proving that the concave dual function is Lipschitz smooth, we further provide convergence guarantees for the PG primal-dual method. Finally, we perform simulations to validate our theoretical findings.
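To make the primal-dual scheme concrete, here is a minimal sketch of a policy gradient primal-dual loop for a discrete-time cost-constrained LQR. All problem data (system matrices, penalty matrices, the constraint level `b1`, the initial gain, and the step sizes) are illustrative assumptions, not the paper's experiments; the policy gradient formula used is the standard one for `u = -Kx`, computed via discrete Lyapunov equations.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Illustrative problem data (hypothetical, not from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q0, R0 = np.eye(2), np.eye(1)                 # primary LQR cost
Q1, R1 = np.diag([2.0, 0.0]), 0.1 * np.eye(1) # constrained cost with user-defined penalties
b1 = 60.0                                     # constraint level: J1(K) <= b1
Sigma0 = np.eye(2)                            # initial-state covariance

def lqr_cost_and_grad(K, Q, R):
    """Cost J(K) = tr(P_K Sigma0) and its exact policy gradient for u = -K x."""
    Acl = A - B @ K
    if max(abs(np.linalg.eigvals(Acl))) >= 1.0:
        return np.inf, None                   # K is not stabilizing
    # P solves P = Acl' P Acl + Q + K' R K; Sigma solves Sigma = Acl Sigma Acl' + Sigma0.
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
    Sigma = solve_discrete_lyapunov(Acl, Sigma0)
    grad = 2.0 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sigma
    return np.trace(P @ Sigma0), grad

K = np.array([[0.5, 1.0]])                    # assumed stabilizing initial gain
lam, eta_K, eta_lam = 0.0, 1e-3, 1e-3         # multiplier and step sizes (assumed)
for _ in range(3000):
    J0, g0 = lqr_cost_and_grad(K, Q0, R0)
    J1, g1 = lqr_cost_and_grad(K, Q1, R1)
    if g0 is None or g1 is None:
        break                                 # safeguard: stop if K left the stabilizing set
    K = K - eta_K * (g0 + lam * g1)           # primal: gradient step on the Lagrangian
    lam = max(0.0, lam + eta_lam * (J1 - b1)) # dual: projected gradient ascent
```

The dual update is a projected ascent step on the concave dual function; the paper's Lipschitz-smoothness result is what licenses a constant dual step size in such a scheme.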