arXiv Analytics

arXiv:2012.13739 [math.PR]

Transience in Countable MDPs

Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke

Published 2020-12-26 (Version 1)

The Transient objective requires that no state be visited infinitely often. While this is impossible in any finite Markov Decision Process (MDP), it can be satisfied in countably infinite ones, e.g., if the transition graph is acyclic. We prove the following fundamental properties of Transient in countably infinite MDPs.

1. There exist uniformly $\epsilon$-optimal MD (memoryless deterministic) strategies for Transient, even in infinitely branching MDPs.

2. Optimal strategies for Transient need not exist, even if the MDP is finitely branching. However, if an optimal strategy exists, then there is also an optimal MD strategy.

3. If an MDP is universally transient (i.e., almost surely transient under all strategies), then many other objectives have a lower strategy complexity than in general MDPs. E.g., $\epsilon$-optimal strategies for Safety and co-B\"uchi, and optimal strategies for $\{0,1,2\}$-Parity (where they exist), can be chosen MD, even if the MDP is infinitely branching.
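To see why countably infinite state spaces make a difference, consider a minimal sketch (not from the paper; all names are illustrative): an MDP over the natural numbers with a single action that moves right with probability $p$ and left with probability $1-p$. For $p > 1/2$ the induced chain drifts to $+\infty$, so almost every run visits each state only finitely often, i.e., it is transient. In a finite MDP this cannot happen, since some state must recur along every infinite run.

```python
import random
from collections import Counter

# Illustrative sketch: a countably infinite "MDP" over the naturals with a
# single action, so the unique (trivially MD) strategy induces a biased
# random walk.  With rightward bias p > 1/2 the walk drifts to +infinity,
# so each individual state receives only finitely many visits on a typical
# run -- the Transient objective is satisfied with probability 1.
def simulate(p=0.7, steps=10_000, seed=0):
    rng = random.Random(seed)
    state = 0
    visits = Counter()
    for _ in range(steps):
        visits[state] += 1
        # Reflect at 0; otherwise step right w.p. p, left w.p. 1 - p.
        if state > 0 and rng.random() >= p:
            state -= 1
        else:
            state += 1
    return state, visits

final, visits = simulate()
# The final state grows linearly in the number of steps (drift 2p - 1),
# while the visit count of any fixed early state stays small.
print(final, visits[0])
```

This only illustrates the phenomenon for a fixed Markov chain; the paper's results concern the strategy complexity of achieving (or approximating) transience when the player actually has choices to make.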
