arXiv Analytics

arXiv:1904.07272 [cs.LG]

Introduction to Multi-Armed Bandits

Aleksandrs Slivkins

Published 2019-04-15 (Version 1)

Multi-armed bandits are a simple but very powerful framework for algorithms that make decisions over time under uncertainty. An enormous body of work has accumulated over the years, covered in several books and surveys. This book provides a more introductory, textbook-like treatment of the subject. Each chapter tackles a particular line of work, providing a self-contained, teachable technical introduction and a review of the more advanced results. The chapters are as follows: Stochastic bandits; Lower bounds; Bayesian Bandits and Thompson Sampling; Lipschitz Bandits; Full Feedback and Adversarial Costs; Adversarial Bandits; Linear Costs and Semi-bandits; Contextual Bandits; Bandits and Zero-Sum Games; Bandits with Knapsacks; Incentivized Exploration and Connections to Mechanism Design. Status of the manuscript: essentially complete (modulo some polishing), except for the last chapter, which the author plans to add over the next few months.
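
The book itself contains no code; as a minimal sketch of the stochastic bandit setting mentioned above, the snippet below runs the classic UCB1 algorithm against Bernoulli arms in Python. The function name, arm means, and horizon are illustrative assumptions, not taken from the book.

# Minimal sketch (not from the book): the stochastic bandit loop with UCB1.
# Arm means, horizon, and all names below are illustrative assumptions.
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Play `horizon` rounds of UCB1 against Bernoulli arms with the given means."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k        # times each arm was pulled
    sums = [0.0] * k        # total reward collected per arm
    total_reward = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1     # play each arm once to initialize
        else:
            # choose the arm maximizing empirical mean + confidence radius
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0  # Bernoulli feedback
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward

    regret = max(arm_means) * horizon - total_reward
    return total_reward, regret

if __name__ == "__main__":
    reward, regret = ucb1([0.3, 0.5, 0.7], horizon=10_000)
    print(f"total reward: {reward:.0f}, empirical regret: {regret:.1f}")

The confidence radius sqrt(2 log t / n_a) shrinks as an arm is pulled more often, so the algorithm balances exploration of under-sampled arms against exploitation of the empirically best one; this is the basic trade-off the book's stochastic bandits chapter analyzes.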

Related articles:
arXiv:1307.5438 [cs.LG] (Published 2013-07-20, updated 2014-10-05)
Towards Distribution-Free Multi-Armed Bandits with Combinatorial Strategies
arXiv:2006.06856 [cs.LG] (Published 2020-06-11)
Bandit-PAM: Almost Linear Time $k$-Medoids Clustering via Multi-Armed Bandits
arXiv:0909.1334 [cs.LG] (Published 2009-09-07, updated 2009-09-08)
Lower Bounds for BMRM and Faster Rates for Training SVMs