arXiv:2212.06355 [stat.ML]

A Review of Off-Policy Evaluation in Reinforcement Learning

Masatoshi Uehara, Chengchun Shi, Nathan Kallus

Published 2022-12-13 (Version 1)

Reinforcement learning (RL) is one of the most vibrant research frontiers in machine learning and has recently been applied to solve a number of challenging problems. In this paper, we focus primarily on off-policy evaluation (OPE), one of the most fundamental topics in RL. In recent years, a number of OPE methods have been developed in the statistics and computer science literature. We discuss the efficiency bound of OPE, several existing state-of-the-art OPE methods and their statistical properties, and other related research directions that are currently being actively explored.
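Since the abstract centers on OPE (estimating the value of a target policy from trajectories logged under a different behavior policy), a minimal sketch of the classical trajectory-wise importance sampling estimator may help fix ideas. This is not the paper's method; the tabular environment, the policies, and all variable names below are illustrative assumptions.

```python
# Illustrative sketch (not from the paper): ordinary trajectory-wise
# importance sampling, one of the classical OPE estimators that reviews
# of this area typically survey. All quantities are synthetic.
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, horizon, n_trajectories = 5, 2, 10, 5000
gamma = 0.95

# Hypothetical tabular policies: behavior (logging) and target (evaluation).
behavior_policy = rng.dirichlet(np.ones(n_actions), size=n_states)
target_policy = rng.dirichlet(np.ones(n_actions), size=n_states)

# Hypothetical environment dynamics and rewards, used only to generate logs.
transition = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
reward = rng.uniform(size=(n_states, n_actions))


def importance_sampling_estimate():
    """Estimate the target policy's value from behavior-policy trajectories."""
    weighted_returns = np.zeros(n_trajectories)
    for i in range(n_trajectories):
        s = rng.integers(n_states)
        weight, discounted_return = 1.0, 0.0
        for t in range(horizon):
            a = rng.choice(n_actions, p=behavior_policy[s])
            # Cumulative likelihood ratio pi_target(a|s) / pi_behavior(a|s).
            weight *= target_policy[s, a] / behavior_policy[s, a]
            discounted_return += (gamma ** t) * reward[s, a]
            s = rng.choice(n_states, p=transition[s, a])
        # Reweight the full discounted return by the trajectory-level ratio.
        weighted_returns[i] = weight * discounted_return
    return weighted_returns.mean()


print(f"IS estimate of target-policy value: {importance_sampling_estimate():.3f}")
```

The estimator is unbiased but its variance grows with the horizon, which is one motivation for the doubly robust and marginalized (efficiency-bound-attaining) estimators that reviews of OPE discuss.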

Related articles:
arXiv:2306.04836 [stat.ML] (Published 2023-06-07)
$K$-Nearest-Neighbor Resampling for Off-Policy Evaluation in Stochastic Control
arXiv:2006.06982 [stat.ML] (Published 2020-06-12)
Confidence Interval for Off-Policy Evaluation from Dependent Samples via Bandit Algorithm: Approach from Standardized Martingales
arXiv:2108.04763 [stat.ML] (Published 2021-08-10)
Imitation Learning by Reinforcement Learning