arXiv Analytics

arXiv:1904.07272 [cs.LG]

Introduction to Multi-Armed Bandits

Aleksandrs Slivkins

Published 2019-04-15 (Version 1)

Multi-armed bandits are a simple but very powerful framework for algorithms that make decisions over time under uncertainty. An enormous body of work has accumulated over the years, covered in several books and surveys. This book provides a more introductory, textbook-like treatment of the subject. Each chapter tackles a particular line of work, providing a self-contained, teachable technical introduction and a review of the more advanced results. The chapters are as follows: Stochastic bandits; Lower bounds; Bayesian Bandits and Thompson Sampling; Lipschitz Bandits; Full Feedback and Adversarial Costs; Adversarial Bandits; Linear Costs and Semi-bandits; Contextual Bandits; Bandits and Zero-Sum Games; Bandits with Knapsacks; Incentivized Exploration and Connections to Mechanism Design. Status of the manuscript: essentially complete (modulo some polishing), except for the last chapter, which the author plans to add over the next few months.
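
The book itself contains no code; as a minimal sketch of the stochastic bandit setting mentioned above, the snippet below runs the classic UCB1 algorithm against Bernoulli arms in Python. The function name, arm means, and horizon are illustrative assumptions, not taken from the book.

# Minimal sketch (not from the book): the stochastic bandit loop with UCB1.
# Arm means, horizon, and all names below are illustrative assumptions.
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Play `horizon` rounds of UCB1 against Bernoulli arms with the given means."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k        # times each arm was pulled
    sums = [0.0] * k        # total reward collected per arm
    total_reward = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1     # play each arm once to initialize
        else:
            # choose the arm maximizing empirical mean + confidence radius
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0  # Bernoulli feedback
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward

    regret = max(arm_means) * horizon - total_reward
    return total_reward, regret

if __name__ == "__main__":
    reward, regret = ucb1([0.3, 0.5, 0.7], horizon=10_000)
    print(f"total reward: {reward:.0f}, empirical regret: {regret:.1f}")

The confidence radius sqrt(2 log t / n_a) shrinks as an arm is pulled more often, so the algorithm balances exploration of under-sampled arms against exploitation of the empirically best one; this is the basic trade-off the book's stochastic bandits chapter analyzes.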

Related articles:
arXiv:1307.5438 [cs.LG] (Published 2013-07-20, updated 2014-10-05)
Towards Distribution-Free Multi-Armed Bandits with Combinatorial Strategies
arXiv:2006.06856 [cs.LG] (Published 2020-06-11)
Bandit-PAM: Almost Linear Time $k$-Medoids Clustering via Multi-Armed Bandits
arXiv:0909.1334 [cs.LG] (Published 2009-09-07, updated 2009-09-08)
Lower Bounds for BMRM and Faster Rates for Training SVMs