arXiv Analytics

arXiv:2101.10895 [math.OC]

A Primal-Dual Approach to Constrained Markov Decision Processes

Yi Chen, Jing Dong, Zhaoran Wang

Published 2021-01-26 (Version 1)

In many operations management problems, we need to make decisions sequentially to minimize the cost while satisfying certain constraints. One modeling approach for such problems is the constrained Markov decision process (CMDP). When solving a CMDP to derive good operational policies, there are two key challenges: one is the prohibitively large state and action spaces; the other is the hard-to-compute transition kernel. In this work, we develop a sampling-based primal-dual algorithm to solve CMDPs. Our approach alternately applies regularized policy iteration to improve the policy and subgradient ascent on the dual variables to maintain the constraints. Under mild regularity conditions, we show that the algorithm converges at rate $O(\log(T)/\sqrt{T})$, where $T$ is the number of iterations. When the CMDP has a weakly coupled structure, our approach can substantially reduce the dimension of the problem through an embedded decomposition. We apply the algorithm to two important applications with weakly coupled structures: multi-product inventory management and multi-class queue scheduling, and show that it generates controls that outperform state-of-the-art heuristics.
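The alternation the abstract describes (a regularized policy-improvement step on the Lagrangian, then a subgradient step on the multiplier) can be illustrated with a toy tabular sketch. This is not the paper's algorithm: the paper's method is sampling-based, whereas this sketch uses exact policy evaluation on a small known model, and the softmax (entropy-regularized) improvement step, the step sizes, and all variable names are illustrative assumptions.

```python
import numpy as np

def primal_dual_cmdp_sketch(P, c, d, budget, gamma=0.9, T=200, eta=0.5, tau=0.1):
    """Toy primal-dual loop for a tabular CMDP (illustrative, not the paper's method).

    P      -- transition kernel, shape (S, A, S), rows sum to 1
    c      -- cost to minimize, shape (S, A)
    d      -- constraint cost, shape (S, A); we want its discounted value <= budget
    Alternates: (1) regularized policy improvement on the Lagrangian c + lam*d,
                (2) projected subgradient ascent on the dual variable lam.
    """
    S, A = c.shape
    pi = np.full((S, A), 1.0 / A)   # start from the uniform policy
    lam = 0.0                        # dual variable (Lagrange multiplier)
    for t in range(T):
        # --- primal step: evaluate the Lagrangian cost under the current policy
        cost = c + lam * d                                 # (S, A)
        P_pi = np.einsum('sa,sap->sp', pi, P)              # (S, S) induced chain
        c_pi = (pi * cost).sum(axis=1)
        V = np.linalg.solve(np.eye(S) - gamma * P_pi, c_pi)
        Q = cost + gamma * np.einsum('sap,p->sa', P, V)
        # entropy-regularized (softmax) improvement; tau is the regularization weight
        pi = np.exp(-Q / tau)
        pi /= pi.sum(axis=1, keepdims=True)
        # --- dual step: move lam along the constraint violation, keep lam >= 0
        d_pi = (pi * d).sum(axis=1)
        Vd = np.linalg.solve(np.eye(S) - gamma * P_pi, d_pi)
        lam = max(0.0, lam + eta / np.sqrt(t + 1) * (Vd.mean() - budget))
    return pi, lam
```

The diminishing $\eta/\sqrt{t+1}$ dual step size mirrors the $O(\log(T)/\sqrt{T})$ rate regime described in the abstract, though the exact schedule here is an assumption.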

Related articles: Most relevant | Search more
arXiv:2206.01666 [math.OC] (Published 2022-06-03)
Algorithm for Constrained Markov Decision Process with Linear Convergence
arXiv:1901.08731 [math.OC] (Published 2019-01-25)
A Primal-Dual Approach to Markovian Network Optimization
arXiv:1108.0128 [math.OC] (Published 2011-07-31)
Delay Optimal Multichannel Opportunistic Access