arXiv:2207.01115 [cs.LG]

USHER: Unbiased Sampling for Hindsight Experience Replay

Liam Schramm, Yunfu Deng, Edgar Granados, Abdeslam Boularias

Published 2022-07-03 (Version 1)

Dealing with sparse rewards is a long-standing challenge in reinforcement learning (RL). Hindsight Experience Replay (HER) addresses this problem by reusing failed trajectories for one goal as successful trajectories for another. This relabeling guarantees a minimum density of reward and enables generalization across multiple goals. However, the strategy is known to produce a biased value function, because the update rule underestimates the likelihood of bad outcomes in a stochastic environment. We propose an asymptotically unbiased importance-sampling-based algorithm that addresses this problem without sacrificing performance in deterministic environments. We show its effectiveness on a range of robotic systems, including challenging high-dimensional stochastic environments.
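For readers unfamiliar with the relabeling step the abstract refers to, the following Python sketch illustrates the standard HER "future" relabeling strategy. All class, method, and parameter names here are hypothetical illustrations, not the paper's implementation, and the importance-sampling correction that USHER adds is not shown.

```python
import random
from collections import deque

class HindsightReplayBuffer:
    """A minimal sketch of HER-style goal relabeling (illustrative API).

    Transitions are stored as (state, action, next_state, achieved_goal,
    desired_goal), plus a handle to their episode so a goal achieved later
    in the same trajectory can be substituted at sampling time.
    """

    def __init__(self, capacity=100_000, relabel_prob=0.8):
        self.buffer = deque(maxlen=capacity)
        self.relabel_prob = relabel_prob

    def store_episode(self, episode):
        # episode: list of (state, action, next_state, achieved_goal, desired_goal)
        for t, (s, a, s2, ag, g) in enumerate(episode):
            self.buffer.append((s, a, s2, ag, g, episode, t))

    def sample(self, batch_size, reward_fn):
        batch = []
        idxs = random.sample(range(len(self.buffer)), batch_size)
        for i in idxs:
            s, a, s2, ag, g, episode, t = self.buffer[i]
            if random.random() < self.relabel_prob:
                # "Future" strategy: pretend the goal was one actually
                # achieved later in this trajectory, turning a failed
                # episode into a successful one for that goal.
                future_t = random.randint(t, len(episode) - 1)
                g = episode[future_t][3]  # achieved_goal at a later step
            # Sparse reward recomputed against the (possibly relabeled)
            # goal, e.g. 0 if achieved_goal matches g, else -1.
            r = reward_fn(ag, g)
            batch.append((s, a, s2, g, r))
        return batch
```

In a stochastic environment this relabeling is biased: goals are sampled conditioned on having been reached, so the learner underestimates the probability of failing to reach them. As the abstract describes, the paper corrects this by weighting updates with an importance-sampling ratio so that value estimates are asymptotically unbiased; the exact weighting scheme is given in the paper itself.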

Related articles:
arXiv:1707.01495 [cs.LG] (Published 2017-07-05)
Hindsight Experience Replay
arXiv:2008.09377 [cs.LG] (Published 2020-08-21)
Curriculum Learning with Hindsight Experience Replay for Sequential Object Manipulation Tasks
arXiv:1809.02070 [cs.LG] (Published 2018-09-06)
ARCHER: Aggressive Rewards to Counter bias in Hindsight Experience Replay