arXiv:1403.5341 [cs.LG]AbstractReferencesReviewsResources
An Information-Theoretic Analysis of Thompson Sampling
Daniel Russo, Benjamin Van Roy
Published 2014-03-21, updated 2015-06-08Version 2
We provide an information-theoretic analysis of Thompson sampling that applies across a broad range of online optimization problems in which a decision-maker must learn from partial feedback. This analysis inherits the simplicity and elegance of information theory and leads to regret bounds that scale with the entropy of the optimal-action distribution. This strengthens preexisting results and yields new insight into how information improves performance.
Categories: cs.LG
Related articles: Most relevant | Search more
arXiv:1803.04623 [cs.LG] (Published 2018-03-13)
Thompson Sampling for Combinatorial Semi-Bandits
arXiv:1811.04471 [cs.LG] (Published 2018-11-11)
Thompson Sampling for Pursuit-Evasion Problems
arXiv:2209.08197 [cs.LG] (Published 2022-09-16)
Thompson Sampling with Virtual Helping Agents