arXiv:2006.11647 Abstract | arXiv Analytics

arXiv:2006.11647 [cs.LG]Abstract References Reviews Resources

An Optimal Elimination Algorithm for Learning a Best Arm

Avinatan Hassidim, Ron Kupfer, Yaron Singer

Published 2020-06-20Version 1

We consider the classic problem of $(\epsilon,\delta)$-PAC learning a best arm where the goal is to identify with confidence $1-\delta$ an arm whose mean is an $\epsilon$-approximation to that of the highest mean arm in a multi-armed bandit setting. This problem is one of the most fundamental problems in statistics and learning theory, yet somewhat surprisingly its worst-case sample complexity is not well understood. In this paper, we propose a new approach for $(\epsilon,\delta)$-PAC learning a best arm. This approach leads to an algorithm whose sample complexity converges to \emph{exactly} the optimal sample complexity of $(\epsilon,\delta)$-learning the mean of $n$ arms separately and we complement this result with a conditional matching lower bound. More specifically:

Categories: cs.LG, stat.ML

Keywords: best arm, optimal elimination algorithm, worst-case sample complexity, conditional matching lower bound, sample complexity converges

Related articles: Most relevant | Search more

arXiv:2001.10055 [cs.LG] (Published 2020-01-24)

Ballooning Multi-Armed Bandits

Ganesh Ghalme, Swapnil Dhamal, Shweta Jain, Sujit Gujar, Y. Narahari

arXiv:2410.17835 [cs.LG] (Published 2024-10-23)

Optimal Streaming Algorithms for Multi-Armed Bandits

Tianyuan Jin, Keke Huang, Jing Tang, Xiaokui Xiao

arXiv:2002.00315 [cs.LG] (Published 2020-02-02)

A Closer Look at Small-loss Bounds for Bandits with Graph Feedback

Chung-Wei Lee, Haipeng Luo, Mengxiao Zhang