arXiv Analytics

arXiv:1311.0468 [stat.ML]

Thompson Sampling for Online Learning with Linear Experts

Aditya Gopalan

Published 2013-11-03 (Version 1)

In this note, we present a version of the Thompson sampling algorithm for the problem of online linear optimization with full information (i.e., the experts setting), studied by Kalai and Vempala (2005). The algorithm uses a Gaussian prior and time-varying Gaussian likelihoods, and we show that it essentially reduces to Kalai and Vempala's Follow-the-Perturbed-Leader strategy, with the exponentially distributed perturbations replaced by Gaussian noise. This implies O(sqrt(T)) regret bounds for Thompson sampling (with time-varying likelihoods) for online learning with full information.
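
To make the reduction concrete, below is a minimal sketch of Follow-the-Perturbed-Leader with Gaussian perturbations in the full-information experts setting. The function name, the loss-matrix interface, and the sigma * sqrt(t + 1) noise schedule are illustrative assumptions, not details taken from the note.

```python
import numpy as np

def ftpl_gaussian(loss_vectors, sigma=1.0, seed=None):
    """Follow-the-Perturbed-Leader with Gaussian perturbations in the
    full-information experts setting.

    loss_vectors : array of shape (T, d); loss_vectors[t, i] is the loss
        of expert i at round t, revealed in full after the round.
    sigma        : base scale of the Gaussian perturbation (assumed
        schedule sigma * sqrt(t + 1), an illustrative anytime choice).
    Returns the chosen expert index per round and the learner's total loss.
    """
    rng = np.random.default_rng(seed)
    T, d = loss_vectors.shape
    cumulative = np.zeros(d)              # cumulative losses of each expert
    picks, total_loss = [], 0.0
    for t in range(T):
        # Perturb the cumulative losses with fresh Gaussian noise and
        # follow the perturbed leader (argmin over experts).
        noise = rng.normal(scale=sigma * np.sqrt(t + 1), size=d)
        expert = int(np.argmin(cumulative + noise))
        picks.append(expert)
        total_loss += loss_vectors[t, expert]
        cumulative += loss_vectors[t]     # full information: all losses observed
    return picks, total_loss

# Example: regret against the best fixed expert on random losses.
rng = np.random.default_rng(0)
losses = rng.uniform(size=(1000, 5))
_, total = ftpl_gaussian(losses, sigma=1.0, seed=1)
print(total - losses.sum(axis=0).min())
```

The noise schedule above is only a stand-in for the time-varying Gaussian likelihoods discussed in the note; in the equivalent Thompson-sampling view, drawing from a Gaussian posterior centered on the (scaled) cumulative losses amounts to the same Gaussian perturbation of the leader.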

Related articles:
arXiv:1208.3728 [stat.ML] (Published 2012-08-18, updated 2014-05-24)
Online Learning with Predictable Sequences
arXiv:1807.09089 [stat.ML] (Published 2018-07-24)
Decision Variance in Online Learning
arXiv:1810.02567 [stat.ML] (Published 2018-10-05)
Online Learning to Rank with Features