arXiv Analytics

Sign in

arXiv:1403.5341 [cs.LG]AbstractReferencesReviewsResources

An Information-Theoretic Analysis of Thompson Sampling

Daniel Russo, Benjamin Van Roy

Published 2014-03-21, updated 2015-06-08Version 2

We provide an information-theoretic analysis of Thompson sampling that applies across a broad range of online optimization problems in which a decision-maker must learn from partial feedback. This analysis inherits the simplicity and elegance of information theory and leads to regret bounds that scale with the entropy of the optimal-action distribution. This strengthens preexisting results and yields new insight into how information improves performance.

Related articles: Most relevant | Search more
arXiv:1803.04623 [cs.LG] (Published 2018-03-13)
Thompson Sampling for Combinatorial Semi-Bandits
arXiv:1811.04471 [cs.LG] (Published 2018-11-11)
Thompson Sampling for Pursuit-Evasion Problems
arXiv:2209.08197 [cs.LG] (Published 2022-09-16)
Thompson Sampling with Virtual Helping Agents