arXiv:2112.14233 [stat.ML]

Learning Across Bandits in High Dimension via Robust Statistics

Kan Xu, Hamsa Bastani

Published 2021-12-28, updated 2022-02-15 (version 2)

Decision-makers often face the "many bandits" problem, where one must simultaneously learn across related but heterogeneous contextual bandit instances. For instance, a large retailer may wish to dynamically learn product demand across many stores to solve pricing or inventory problems, making it desirable to learn jointly for stores serving similar customers; alternatively, a hospital network may wish to dynamically learn patient risk across many providers to allocate personalized interventions, making it desirable to learn jointly for hospitals serving similar patient populations. We study the setting where the unknown parameter in each bandit instance can be decomposed into a global parameter plus a sparse instance-specific term. Then, we propose a novel two-stage estimator that exploits this structure in a sample-efficient way by using a combination of robust statistics (to learn across similar instances) and LASSO regression (to debias the results). We embed this estimator within a bandit algorithm, and prove that it improves asymptotic regret bounds in the context dimension $d$; this improvement is exponential for data-poor instances. We further demonstrate how our results depend on the underlying network structure of bandit instances. Finally, we illustrate the value of our approach on synthetic and real datasets.
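The abstract's two-stage idea can be illustrated with a minimal sketch. Assume each instance's parameter is $\theta_i = \beta + \delta_i$ with $\delta_i$ sparse: stage one estimates the global $\beta$ by taking a coordinatewise median of per-instance OLS estimates (a robust statistic, since the sparse deviations act like coordinatewise outliers), and stage two runs a LASSO per instance on the residual signal to recover $\delta_i$. All tuning choices below (the median as the robust aggregator, the LASSO penalty `alpha=0.05`, the synthetic data sizes) are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
K, n, d = 10, 200, 20          # bandit instances, samples per instance, context dimension

# Ground truth: a shared global parameter plus a sparse per-instance deviation.
beta = rng.normal(size=d)
deltas = np.zeros((K, d))
for i in range(K):
    support = rng.choice(d, size=2, replace=False)
    deltas[i, support] = rng.choice([-2.0, 2.0], size=2)

Xs = [rng.normal(size=(n, d)) for _ in range(K)]
ys = [Xs[i] @ (beta + deltas[i]) + 0.1 * rng.normal(size=n) for i in range(K)]

# Stage 1: per-instance OLS, then a coordinatewise median across instances.
# Because each delta_i is sparse, on any given coordinate only a few instances
# deviate from beta, so the median is robust to those deviations.
ols = np.stack([np.linalg.lstsq(Xs[i], ys[i], rcond=None)[0] for i in range(K)])
beta_hat = np.median(ols, axis=0)

# Stage 2: LASSO on the residual signal y_i - X_i @ beta_hat recovers the
# sparse instance-specific term delta_i; adding beta_hat back gives theta_i.
theta_hat = np.zeros((K, d))
for i in range(K):
    lasso = Lasso(alpha=0.05, fit_intercept=False)
    lasso.fit(Xs[i], ys[i] - Xs[i] @ beta_hat)
    theta_hat[i] = beta_hat + lasso.coef_

err = np.linalg.norm(theta_hat - (beta + deltas), axis=1).max()
print(f"max per-instance estimation error: {err:.3f}")
```

In a bandit loop this estimator would replace the plain per-instance regression inside, e.g., a greedy or UCB-style algorithm; the gain the paper proves comes from the fact that stage one pools samples across all instances, which is most valuable for data-poor instances.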

Related articles:
arXiv:2109.04433 [stat.ML] (Published 2021-09-09)
Extreme Bandits using Robust Statistics
arXiv:1707.09752 [stat.ML] (Published 2017-07-31)
Anomaly Detection by Robust Statistics
arXiv:2403.15038 [stat.ML] (Published 2024-03-22)
Estimation of multiple mean vectors in high dimension