arXiv:1907.01285 [cs.LG]

Learning the Arrow of Time

Nasim Rahaman, Steffen Wolf, Anirudh Goyal, Roman Remme, Yoshua Bengio

Published 2019-07-02, Version 1

We humans seem to have an innate understanding of the asymmetric progression of time, which we use to efficiently and safely perceive and manipulate our environment. Drawing inspiration from that, we address the problem of learning an arrow of time in a Markov (Decision) Process. We illustrate how a learned arrow of time can capture meaningful information about the environment, which in turn can be used to measure reachability, detect side-effects and to obtain an intrinsic reward signal. We show empirical results on a selection of discrete and continuous environments, and demonstrate for a class of stochastic processes that the learned arrow of time agrees reasonably well with a known notion of an arrow of time given by the celebrated Jordan-Kinderlehrer-Otto result.

Comments: A shorter version of this work was presented at the Theoretical Physics for Deep Learning Workshop, ICML 2019

Categories: cs.LG, cs.AI

Keywords: environment, learned arrow, stochastic processes, asymmetric progression, intrinsic reward signal

Related articles:

arXiv:2106.10075 [cs.LG] (Published 2021-06-18)

Learning to Plan via a Multi-Step Policy Regression Method

Stefan Wagner, Michael Janschek, Tobias Uelwer, Stefan Harmeling

arXiv:1706.09520 [cs.LG] (Published 2017-06-29)

Neural SLAM

Jingwei Zhang, Lei Tai, Joschka Boedecker, Wolfram Burgard, Ming Liu

arXiv:1708.01289 [cs.LG] (Published 2017-08-03)

Independently Controllable Features

Valentin Thomas et al.
