
arXiv:2006.10459 [stat.ML]

Stochastic bandits with arm-dependent delays

Anne Gael Manegueu, Claire Vernade, Alexandra Carpentier, Michal Valko

Published 2020-06-18, Version 1

Significant work has recently been dedicated to the stochastic delayed bandit setting because of its relevance in applications. The applicability of existing algorithms is, however, restricted by the strong assumptions often made on the delay distributions, such as full observability, restrictive shape constraints, or uniformity over arms. In this work, we weaken these assumptions significantly and only assume that there is a bound on the tail of the delay. In particular, we cover the important case where the delay distributions vary across arms, and the case where the delays are heavy-tailed. Addressing these difficulties, we propose a simple but efficient UCB-based algorithm called PatientBandits. We provide both problem-dependent and problem-independent bounds on the regret, as well as performance lower bounds.

Comments: 19 pages, 4 figures

Categories: stat.ML, cs.LG

Subjects: 62L10

Keywords: arm-dependent delays, stochastic bandits, performance lower bounds, delay distributions vary, efficient ucb-based algorithm

Related articles:

arXiv:2006.08850 [stat.ML] (Published 2020-06-16)

Finding All ε-Good Arms in Stochastic Bandits

Blake Mason, Lalit Jain, Ardhendu Tripathy, Robert Nowak

arXiv:1707.02649 [stat.ML] (Published 2017-07-09)

Nonlinear Sequential Accepts and Rejects for Identification of Top Arms in Stochastic Bandits

Shahin Shahrampour, Vahid Tarokh

arXiv:1906.02685 [stat.ML] (Published 2019-06-06)