arXiv Analytics

arXiv:1311.0468 [stat.ML]

Thompson Sampling for Online Learning with Linear Experts

Aditya Gopalan

Published 2013-11-03 (Version 1)

In this note, we present a version of the Thompson sampling algorithm for the problem of online linear optimization with full information (i.e., the experts setting), studied by Kalai and Vempala (2005). The algorithm uses a Gaussian prior and time-varying Gaussian likelihoods, and we show that it essentially reduces to Kalai and Vempala's Follow-the-Perturbed-Leader strategy, with the exponentially distributed perturbations replaced by Gaussian noise. This implies O(sqrt(T)) regret bounds for Thompson sampling (with time-varying likelihoods) for online learning with full information.
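
To make the reduction concrete, below is a minimal sketch of Follow-the-Perturbed-Leader with Gaussian perturbations in the full-information experts setting. The function name, the loss-matrix interface, and the sigma * sqrt(t + 1) noise schedule are illustrative assumptions, not details taken from the note.

```python
import numpy as np

def ftpl_gaussian(loss_vectors, sigma=1.0, seed=None):
    """Follow-the-Perturbed-Leader with Gaussian perturbations in the
    full-information experts setting.

    loss_vectors : array of shape (T, d); loss_vectors[t, i] is the loss
        of expert i at round t, revealed in full after the round.
    sigma        : base scale of the Gaussian perturbation (assumed
        schedule sigma * sqrt(t + 1), an illustrative anytime choice).
    Returns the chosen expert index per round and the learner's total loss.
    """
    rng = np.random.default_rng(seed)
    T, d = loss_vectors.shape
    cumulative = np.zeros(d)              # cumulative losses of each expert
    picks, total_loss = [], 0.0
    for t in range(T):
        # Perturb the cumulative losses with fresh Gaussian noise and
        # follow the perturbed leader (argmin over experts).
        noise = rng.normal(scale=sigma * np.sqrt(t + 1), size=d)
        expert = int(np.argmin(cumulative + noise))
        picks.append(expert)
        total_loss += loss_vectors[t, expert]
        cumulative += loss_vectors[t]     # full information: all losses observed
    return picks, total_loss

# Example: regret against the best fixed expert on random losses.
rng = np.random.default_rng(0)
losses = rng.uniform(size=(1000, 5))
_, total = ftpl_gaussian(losses, sigma=1.0, seed=1)
print(total - losses.sum(axis=0).min())
```

The noise schedule above is only a stand-in for the time-varying Gaussian likelihoods discussed in the note; in the equivalent Thompson-sampling view, drawing from a Gaussian posterior centered on the (scaled) cumulative losses amounts to the same Gaussian perturbation of the leader.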

Related articles:
arXiv:1208.3728 [stat.ML] (Published 2012-08-18, updated 2014-05-24)
Online Learning with Predictable Sequences
arXiv:1807.09089 [stat.ML] (Published 2018-07-24)
Decision Variance in Online Learning
arXiv:1810.02567 [stat.ML] (Published 2018-10-05)
Online Learning to Rank with Features