arXiv:2308.03297 Abstract | arXiv Analytics

arXiv:2308.03297 [math.OC]Abstract References Reviews Resources

Approximate Constrained Discounted Dynamic Programming with Uniform Feasibility and Optimality

Published 2023-08-07Version 1

We consider a dynamic programming (DP) approach to approximately solving an infinite-horizon constrained Markov decision process (CMDP) problem with a fixed initial-state for the expected total discounted-reward criterion with a uniform-feasibility constraint of the expected total discounted-cost in a deterministic, history-independent, and stationary policy set. We derive a DP-equation that recursively holds for a CMDP problem and its sub-CMDP problems, where each problem, induced from the parameters of the original CMDP problem, admits a uniformly-optimal feasible policy in its policy set associated with the inputs to the problem. A policy constructed from the DP-equation is shown to achieve the optimal values, defined for the CMDP problem the policy is a solution to, at all states. Based on the result, we discuss off-line and on-line computational algorithms, motivated from policy iteration for MDPs, whose output sequences have local convergences for the original CMDP problem.

Categories: math.OC

Keywords: approximate constrained discounted dynamic programming, uniform feasibility, original cmdp problem, infinite-horizon constrained markov decision process, optimality

Related articles: Most relevant | Search more

arXiv:1507.05125 [math.OC] (Published 2015-07-17)

On the Optimality of (s,S) Policies

Eugene A. Feinberg, Mark E. Lewis

arXiv:2502.05081 [math.OC] (Published 2025-02-07)

Stochastic internal habit formation and optimality

Michele Aleandri, Alessandro Bondi, Fausto Gozzi

arXiv:2407.08425 [math.OC] (Published 2024-07-11)

Optimality of vaccination for an SIR epidemic with an ICU constraint

Matteo Della Rossa, Lorenzo Freddi, Dan Goreac