arXiv:1205.3181 [cs.LG]
Multiple Identifications in Multi-Armed Bandits
Sébastien Bubeck, Tengyao Wang, Nitin Viswanathan
Published 2012-05-14, Version 1
We study the problem of identifying the top $m$ arms in a multi-armed bandit game. Our proposed solution relies on a new algorithm based on successive rejects of the seemingly bad arms and successive accepts of the good ones. This algorithmic contribution makes it possible to tackle other multiple-identification settings that were previously out of reach. In particular, we show that this idea of successive accepts and rejects applies to the multi-bandit best arm identification problem.
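The abstract names the idea but not the procedure, so the following Python sketch is only one plausible reading of "successive accepts and rejects" for top-$m$ identification under a fixed budget, not a statement of the authors' exact algorithm. The phase lengths, the gap rule used to choose which arm to retire, and the names `sar_top_m`, `pull`, `num_arms`, and `budget` are all assumptions introduced for illustration.

```python
import math
import random


def sar_top_m(pull, num_arms, m, budget):
    """Guess the m arms with the highest means using at most `budget` pulls.

    `pull(i)` is a hypothetical callback returning one reward for arm i.
    """
    if not 0 < m <= num_arms or budget <= num_arms:
        raise ValueError("need 0 < m <= num_arms and budget > num_arms")

    # Assumed phase schedule: n_k = ceil((budget - K) / (log_bar * (K + 1 - k))).
    log_bar = 0.5 + sum(1.0 / i for i in range(2, num_arms + 1))
    active = list(range(num_arms))
    sums = [0.0] * num_arms
    counts = [0] * num_arms
    accepted = set()
    remaining = m  # number of top arms still to be identified
    prev_n = 0

    for k in range(1, num_arms):
        # Nothing left to decide: reject all, or accept all, of the active arms.
        if remaining == 0 or remaining == len(active):
            break

        # Sample every active arm until it has n_k pulls in total.
        n_k = math.ceil((budget - num_arms) / (log_bar * (num_arms + 1 - k)))
        for i in active:
            for _ in range(n_k - prev_n):
                sums[i] += pull(i)
                counts[i] += 1
        prev_n = n_k

        mean = {i: sums[i] / counts[i] for i in active}
        ranked = sorted(active, key=lambda i: mean[i], reverse=True)
        r = remaining

        def gap(pos):
            # Empirical distance from the accept/reject boundary.
            i = ranked[pos]
            if pos < r:
                return mean[i] - mean[ranked[r]]
            return mean[ranked[r - 1]] - mean[i]

        # Retire the arm whose status looks most certain.
        pos = max(range(len(ranked)), key=gap)
        arm = ranked[pos]
        active.remove(arm)
        if pos < r:
            accepted.add(arm)  # seemingly good arm: accept
            remaining -= 1
        # otherwise the arm is a seemingly bad one and is simply rejected

    if remaining == len(active):
        accepted.update(active)
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    true_means = [0.9, 0.8, 0.2, 0.1, 0.15]  # made-up Bernoulli arms
    guess = sar_top_m(lambda i: float(rng.random() < true_means[i]),
                      num_arms=len(true_means), m=2, budget=2000)
    print(sorted(guess))
```

Each phase retires exactly one arm, so the arms whose status is clearest leave early and the remaining budget concentrates on arms near the boundary between the top $m$ and the rest.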
Related articles:
arXiv:2211.06883 [cs.LG] (Published 2022-11-13)
Generalizing distribution of partial rewards for multi-armed bandits with temporally-partitioned rewards
arXiv:1911.05142 [cs.LG] (Published 2019-11-12)
Incentivized Exploration for Multi-Armed Bandits under Reward Drift
arXiv:1911.09458 [cs.LG] (Published 2019-11-21)
Observe Before Play: Multi-armed Bandit with Pre-observations