arXiv:1904.06312 [cs.LG]

Let's Play Again: Variability of Deep Reinforcement Learning Agents in Atari Environments

Kaleigh Clary, Emma Tosch, John Foley, David Jensen

Published 2019-04-12, Version 1

Reproducibility in reinforcement learning is challenging: uncontrolled stochasticity from many sources, such as the learning algorithm, the learned policy, and the environment itself, has led researchers to report the performance of learned agents using aggregate metrics over multiple random seeds for a single environment. Unfortunately, there are still pernicious sources of variability in reinforcement learning agents that make common summary statistics an unsound measure of performance. Our experiments demonstrate the variability of common agents used in the popular OpenAI Baselines repository. We make the case for reporting post-training agent performance as a distribution, rather than a point estimate.

Comments: NeurIPS 2018 Critiquing and Correcting Trends Workshop

Categories: cs.LG, cs.AI, stat.ML

Keywords: deep reinforcement learning agents, atari environments, lets play, variability, popular openai baselines repository

Related articles:

arXiv:2104.13207 [cs.LG] (Published 2021-04-27)

SocialAI 0.1: Towards a Benchmark to Stimulate Research on Socio-Cognitive Abilities in Deep Reinforcement Learning Agents

Grgur Kovač, Rémy Portelas, Katja Hofmann, Pierre-Yves Oudeyer

arXiv:2308.02594 [cs.LG] (Published 2023-08-03)

SMARLA: A Safety Monitoring Approach for Deep Reinforcement Learning Agents

Amirhossein Zolfagharian, Manel Abdellatif, Lionel C. Briand, Ramesh S

arXiv:2107.00956 [cs.LG] (Published 2021-07-02)

SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents