arXiv:1702.06103 [cs.LG]

An Improved Parametrization and Analysis of the EXP3++ Algorithm for Stochastic and Adversarial Bandits

Published 2017-02-20, Version 1

We present a new strategy for gap estimation in randomized algorithms for multiarmed bandits and combine it with the EXP3++ algorithm of Seldin and Slivkins (2014). In the stochastic regime, the strategy reduces the dependence of the regret on the time horizon from $(\ln t)^3$ to $(\ln t)^2$ and replaces an additive factor of order $\Delta e^{1/\Delta^2}$ with an additive factor of order $1/\Delta^7$, where $\Delta$ is the minimal gap of a problem instance. In the adversarial regime, the regret guarantee remains unchanged.
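To give a feel for the two changes stated in the abstract, the following minimal sketch evaluates the orders of the old and new additive terms and the ratio of the two logarithmic factors. It is only an illustration under stated assumptions: all constants hidden by "of order" are taken to be 1, the function names are hypothetical, and nothing here implements EXP3++ or its gap estimation.

```python
import math


def old_additive_term(gap: float) -> float:
    """Order of the earlier additive term, gap * exp(1 / gap^2).

    Assumption: hidden constants are set to 1. math.exp overflows for
    gaps below roughly 0.038, so keep the gap above that in this sketch.
    """
    return gap * math.exp(1.0 / gap**2)


def new_additive_term(gap: float) -> float:
    """Order of the improved additive term, 1 / gap^7 (constants set to 1)."""
    return 1.0 / gap**7


def log_factor_saving(t: float) -> float:
    """Ratio (ln t)^3 / (ln t)^2 = ln t between the old and new time dependence."""
    return math.log(t) ** 3 / math.log(t) ** 2


if __name__ == "__main__":
    # Compare the two additive orders for a few illustrative minimal gaps.
    for gap in (0.5, 0.2, 0.1):
        print(f"gap={gap}: old ~ {old_additive_term(gap):.3e}, "
              f"new ~ {new_additive_term(gap):.3e}")
    # The time-dependent factor shrinks by ln t.
    for t in (1e3, 1e6):
        print(f"t={t:.0e}: (ln t)^3 / (ln t)^2 = {log_factor_saving(t):.2f}")
```

With constants ignored, the exponential term dominates the polynomial one once the gap is small (for example at $\Delta = 0.1$), while for large gaps such as $\Delta = 0.5$ the ordering of the two raw expressions is reversed, so the comparison is indicative only.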

Categories: cs.LG, stat.ML

Keywords: adversarial bandits, stochastic, parametrization

Related articles:

arXiv:1704.04470 [cs.LG] (Published 2017-04-14)

Learn From Thy Neighbor: Stochastic & Adversarial Bandits in a Network

L. Elisa Celis, Farnood Salehi

arXiv:1807.07623 [cs.LG] (Published 2018-07-19)

An Optimal Algorithm for Stochastic and Adversarial Bandits

Julian Zimmert, Yevgeny Seldin

arXiv:1910.06054 [cs.LG] (Published 2019-10-14)

An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays

Julian Zimmert, Yevgeny Seldin