arXiv Analytics

arXiv:2106.08229 [cs.LG]

MICo: Learning improved representations via sampling-based state similarity for Markov decision processes

Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland

Published 2021-06-03 (Version 1)

We present a new behavioural distance over the state space of a Markov decision process, and demonstrate the use of this distance as an effective means of shaping the learnt representations of deep reinforcement learning agents. While existing notions of state similarity are typically difficult to learn at scale due to high computational cost and lack of sample-based algorithms, our newly-proposed distance addresses both of these issues. In addition to providing detailed theoretical analysis, we provide empirical evidence that learning this distance alongside the value function yields structured and informative representations, including strong results on the Arcade Learning Environment benchmark.
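To make the "sampling-based" idea concrete, here is a minimal illustrative sketch (not the authors' code) of learning a MICo-style behavioural distance on a toy tabular MDP. It applies a TD-style update based on the operator form described in the paper, U(x, y) ← |r(x) − r(y)| + γ·U(x′, y′), where x′ and y′ are sampled independently from the transition kernel at x and y; all rewards and transition probabilities below are hypothetical toy data.

```python
import numpy as np

# Hypothetical toy MDP under a fixed policy: 4 states, per-state rewards,
# and a row-stochastic transition matrix.
rng = np.random.default_rng(0)
n_states, gamma, lr = 4, 0.9, 0.1

rewards = np.array([0.0, 1.0, 0.0, 0.5])
P = np.array([
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.6, 0.4, 0.0],
    [0.2, 0.0, 0.5, 0.3],
    [0.0, 0.0, 0.1, 0.9],
])

# Sample-based stochastic approximation of the behavioural distance U.
U = np.zeros((n_states, n_states))
for _ in range(20000):
    x, y = rng.integers(n_states), rng.integers(n_states)
    x_next = rng.choice(n_states, p=P[x])
    y_next = rng.choice(n_states, p=P[y])  # sampled independently of x_next
    target = abs(rewards[x] - rewards[y]) + gamma * U[x_next, y_next]
    U[x, y] += lr * (target - U[x, y])

# Note: U(x, x) need not be zero for stochastic transitions (MICo is a
# "diffuse" metric rather than a true metric), and states with similar
# long-term reward profiles end up close under U.
```

In the deep RL setting the paper targets, the same sampled update would instead supervise a parametric approximation of U built from the agent's learnt state representations, trained alongside the value function; the tabular loop above only shows why individual transition samples suffice, avoiding the expensive couplings or full-model computations that make earlier bisimulation-style metrics hard to learn at scale.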

Related articles:
arXiv:1906.03804 [cs.LG] (Published 2019-06-10)
On the Optimality of Sparse Model-Based Planning for Markov Decision Processes
arXiv:2111.10297 [cs.LG] (Published 2021-11-19)
Expert-Guided Symmetry Detection in Markov Decision Processes
arXiv:1910.01074 [cs.LG] (Published 2019-10-02)
Formal Language Constraints for Markov Decision Processes