arXiv:2302.06025 Abstract | arXiv Analytics

arXiv:2302.06025 [stat.ML]Abstract References Reviews Resources

Beyond UCB: Statistical Complexity and Optimal Algorithms for Non-linear Ridge Bandits

Nived Rajaraman, Yanjun Han, Jiantao Jiao, Kannan Ramchandran

Published 2023-02-12Version 1

We consider the sequential decision-making problem where the mean outcome is a non-linear function of the chosen action. Compared with the linear model, two curious phenomena arise in non-linear models: first, in addition to the "learning phase" with a standard parametric rate for estimation or regret, there is an "burn-in period" with a fixed cost determined by the non-linear function; second, achieving the smallest burn-in cost requires new exploration algorithms. For a special family of non-linear functions named ridge functions in the literature, we derive upper and lower bounds on the optimal burn-in cost, and in addition, on the entire learning trajectory during the burn-in period via differential equations. In particular, a two-stage algorithm that first finds a good initial action and then treats the problem as locally linear is statistically optimal. In contrast, several classical algorithms, such as UCB and algorithms relying on regression oracles, are provably suboptimal.

Categories: stat.ML, cs.IT, cs.LG, math.IT, math.ST, stat.TH

Keywords: non-linear ridge bandits, optimal algorithms, statistical complexity, non-linear functions named ridge functions, burn-in period

Related articles: Most relevant | Search more

arXiv:2002.00189 [stat.ML] (Published 2020-02-01)

The Statistical Complexity of Early Stopped Mirror Descent

Tomas Vaškevičius, Varun Kanade, Patrick Rebeschini

arXiv:1610.02918 [stat.ML] (Published 2016-10-10)

Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

Thibault Lesieur, Caterina De Bacco, Jess Banks, Florent Krzakala, Cris Moore, Lenka Zdeborová

arXiv:2103.13059 [stat.ML] (Published 2021-03-24)

Towards Optimal Algorithms for Multi-Player Bandits without Collision Sensing Information

Wei Huang, Richard Combes, Cindy Trinh