arXiv:1905.08165 [stat.ML]
Gradient Ascent for Active Exploration in Bandit Problems
Published 2019-05-20 (Version 1)
We present a new algorithm based on gradient ascent for the general Active Exploration bandit problem in the fixed-confidence setting. This problem encompasses several well-studied problems, such as Best Arm Identification and Thresholding Bandits. The algorithm relies on a new sampling rule based on online lazy mirror ascent. We prove that it is asymptotically optimal and, most importantly, computationally efficient.
Comments: 21 pages, 1 figure
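
The abstract leaves the sampling rule at a high level. Below is a minimal, self-contained Python sketch of what an online lazy mirror ascent (dual-averaging) sampling rule over the simplex of arm allocations can look like for Best Arm Identification with Gaussian arms. This is not the paper's actual algorithm: the surrogate gradient, the exploration bonus 1/counts, and the function name lazy_mirror_ascent_bai are illustrative assumptions made for this sketch.

    import numpy as np

    def lazy_mirror_ascent_bai(arms, n_rounds=10_000, eta=0.1, rng=None):
        """Toy sketch of a lazy-mirror-ascent sampling rule for Best Arm
        Identification (Gaussian arms, unit variance). The surrogate
        gradient below is a hypothetical stand-in, not the paper's
        complexity-based gradient."""
        rng = rng or np.random.default_rng(0)
        K = len(arms)
        counts = np.zeros(K)
        means = np.zeros(K)
        grad_sum = np.zeros(K)  # lazy (dual-averaging) gradient accumulator

        # Initialization: pull each arm once.
        for a in range(K):
            means[a] = arms[a](rng)
            counts[a] = 1

        for _ in range(K, n_rounds):
            # Allocation = mirror map applied to the accumulated gradients
            # (softmax, i.e. the negative-entropy mirror map on the simplex).
            z = eta * grad_sum
            w = np.exp(z - z.max())
            w /= w.sum()

            # Hypothetical surrogate gradient: arms with a small empirical
            # gap to the leader, or few pulls, get a larger ascent direction.
            leader = means.argmax()
            gaps = means[leader] - means
            gaps[leader] = gaps[np.arange(K) != leader].min()
            grad = -0.5 * gaps**2 + 1.0 / counts  # exploration bonus

            grad_sum += grad

            # Sample an arm from the current allocation; update statistics.
            a = rng.choice(K, p=w)
            x = arms[a](rng)
            counts[a] += 1
            means[a] += (x - means[a]) / counts[a]

        return means.argmax(), counts

    # Example: three Gaussian arms with means 0.5, 0.4, 0.1.
    arms = [lambda r, m=m: r.normal(m, 1.0) for m in (0.5, 0.4, 0.1)]
    best, counts = lazy_mirror_ascent_bai(arms)
    print("guessed best arm:", best, "pull counts:", counts)

The "lazy" (dual-averaging) form only accumulates gradients and applies the mirror map once per round, which is what makes this family of sampling rules cheap compared to re-solving the optimal-allocation optimization problem at every step, consistent with the computational-efficiency claim in the abstract.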