arXiv Analytics

arXiv:2012.13739 [math.PR]

Transience in Countable MDPs

Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke

Published 2020-12-26 (Version 1)

The Transient objective requires that no state be visited infinitely often. While this is impossible in any finite Markov Decision Process (MDP), it can be satisfied in countably infinite ones, e.g., if the transition graph is acyclic. We prove the following fundamental properties of Transient in countably infinite MDPs.

1. There exist uniformly $\epsilon$-optimal MD (memoryless deterministic) strategies for Transient, even in infinitely branching MDPs.

2. Optimal strategies for Transient need not exist, even if the MDP is finitely branching. However, if an optimal strategy exists, then there is also an optimal MD strategy.

3. If an MDP is universally transient (i.e., almost surely transient under all strategies), then many other objectives have a lower strategy complexity than in general MDPs. E.g., $\epsilon$-optimal strategies for Safety and co-B\"uchi, and optimal strategies for $\{0,1,2\}$-Parity (where they exist), can be chosen MD, even if the MDP is infinitely branching.
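To see why countably infinite state spaces make a difference, consider a minimal sketch (not from the paper; all names are illustrative): an MDP over the natural numbers with a single action that moves right with probability $p$ and left with probability $1-p$. For $p > 1/2$ the induced chain drifts to $+\infty$, so almost every run visits each state only finitely often, i.e., it is transient. In a finite MDP this cannot happen, since some state must recur along every infinite run.

```python
import random
from collections import Counter

# Illustrative sketch: a countably infinite "MDP" over the naturals with a
# single action, so the unique (trivially MD) strategy induces a biased
# random walk.  With rightward bias p > 1/2 the walk drifts to +infinity,
# so each individual state receives only finitely many visits on a typical
# run -- the Transient objective is satisfied with probability 1.
def simulate(p=0.7, steps=10_000, seed=0):
    rng = random.Random(seed)
    state = 0
    visits = Counter()
    for _ in range(steps):
        visits[state] += 1
        # Reflect at 0; otherwise step right w.p. p, left w.p. 1 - p.
        if state > 0 and rng.random() >= p:
            state -= 1
        else:
            state += 1
    return state, visits

final, visits = simulate()
# The final state grows linearly in the number of steps (drift 2p - 1),
# while the visit count of any fixed early state stays small.
print(final, visits[0])
```

This only illustrates the phenomenon for a fixed Markov chain; the paper's results concern the strategy complexity of achieving (or approximating) transience when the player actually has choices to make.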
