arXiv Analytics

arXiv:2101.10895 [math.OC]

A Primal-Dual Approach to Constrained Markov Decision Processes

Yi Chen, Jing Dong, Zhaoran Wang

Published 2021-01-26 (Version 1)

In many operations management problems, we need to make decisions sequentially to minimize the cost while satisfying certain constraints. One modeling approach for such problems is the constrained Markov decision process (CMDP). When solving a CMDP to derive good operational policies, there are two key challenges: one is the prohibitively large state and action spaces; the other is the hard-to-compute transition kernel. In this work, we develop a sampling-based primal-dual algorithm to solve CMDPs. Our approach alternately applies regularized policy iteration to improve the policy and subgradient ascent on the dual variables to maintain the constraints. Under mild regularity conditions, we show that the algorithm converges at rate $O(\log(T)/\sqrt{T})$, where $T$ is the number of iterations. When the CMDP has a weakly coupled structure, our approach can substantially reduce the dimension of the problem through an embedded decomposition. We apply the algorithm to two important applications with weakly coupled structures: multi-product inventory management and multi-class queue scheduling, and show that it generates controls that outperform state-of-the-art heuristics.
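The alternation the abstract describes (a regularized policy-improvement step on the Lagrangian, then a subgradient step on the multiplier) can be illustrated with a toy tabular sketch. This is not the paper's algorithm: the paper's method is sampling-based, whereas this sketch uses exact policy evaluation on a small known model, and the softmax (entropy-regularized) improvement step, the step sizes, and all variable names are illustrative assumptions.

```python
import numpy as np

def primal_dual_cmdp_sketch(P, c, d, budget, gamma=0.9, T=200, eta=0.5, tau=0.1):
    """Toy primal-dual loop for a tabular CMDP (illustrative, not the paper's method).

    P      -- transition kernel, shape (S, A, S), rows sum to 1
    c      -- cost to minimize, shape (S, A)
    d      -- constraint cost, shape (S, A); we want its discounted value <= budget
    Alternates: (1) regularized policy improvement on the Lagrangian c + lam*d,
                (2) projected subgradient ascent on the dual variable lam.
    """
    S, A = c.shape
    pi = np.full((S, A), 1.0 / A)   # start from the uniform policy
    lam = 0.0                        # dual variable (Lagrange multiplier)
    for t in range(T):
        # --- primal step: evaluate the Lagrangian cost under the current policy
        cost = c + lam * d                                 # (S, A)
        P_pi = np.einsum('sa,sap->sp', pi, P)              # (S, S) induced chain
        c_pi = (pi * cost).sum(axis=1)
        V = np.linalg.solve(np.eye(S) - gamma * P_pi, c_pi)
        Q = cost + gamma * np.einsum('sap,p->sa', P, V)
        # entropy-regularized (softmax) improvement; tau is the regularization weight
        pi = np.exp(-Q / tau)
        pi /= pi.sum(axis=1, keepdims=True)
        # --- dual step: move lam along the constraint violation, keep lam >= 0
        d_pi = (pi * d).sum(axis=1)
        Vd = np.linalg.solve(np.eye(S) - gamma * P_pi, d_pi)
        lam = max(0.0, lam + eta / np.sqrt(t + 1) * (Vd.mean() - budget))
    return pi, lam
```

The diminishing $\eta/\sqrt{t+1}$ dual step size mirrors the $O(\log(T)/\sqrt{T})$ rate regime described in the abstract, though the exact schedule here is an assumption.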

Related articles: Most relevant | Search more
arXiv:2206.01666 [math.OC] (Published 2022-06-03)
Algorithm for Constrained Markov Decision Process with Linear Convergence
arXiv:1901.08731 [math.OC] (Published 2019-01-25)
A Primal-Dual Approach to Markovian Network Optimization
arXiv:1108.0128 [math.OC] (Published 2011-07-31)
Delay Optimal Multichannel Opportunistic Access