arXiv Analytics

arXiv:2403.15038 [stat.ML]

Estimation of multiple mean vectors in high dimension

Gilles Blanchard, Jean-Baptiste Fermanian, Hannah Marienwald

Published 2024-03-22, Version 1

We endeavour to estimate numerous multi-dimensional means of various probability distributions on a common space based on independent samples. Our approach involves forming estimators through convex combinations of empirical means derived from these samples. We introduce two strategies to find appropriate data-dependent convex combination weights: a first one employing a testing procedure to identify neighbouring means with low variance, which results in a closed-form plug-in formula for the weights, and a second one determining weights via minimization of an upper confidence bound on the quadratic risk. Through theoretical analysis, we evaluate the improvement in quadratic risk offered by our methods compared to the empirical means. Our analysis focuses on a dimensional asymptotics perspective, showing that our methods asymptotically approach an oracle (minimax) improvement as the effective dimension of the data increases. We demonstrate the efficacy of our methods in estimating multiple kernel mean embeddings through experiments on both simulated and real-world datasets.
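To make the convex-combination idea concrete, here is a minimal, illustrative sketch in Python. It is not the authors' closed-form plug-in formula: the closeness test, the threshold tau, the variance proxy, and the inverse-variance weighting are all simplifying assumptions made for illustration. Each bag's empirical mean is pooled with the means of bags whose empirical means are not significantly different, which mimics the first (testing-based) strategy described in the abstract.

```python
import numpy as np

def empirical_means(samples):
    """Empirical mean and a rough risk proxy E||mu_hat - mu||^2 for each bag."""
    means, var_proxies = [], []
    for X in samples:                      # X has shape (n_i, d)
        n, d = X.shape
        means.append(X.mean(axis=0))
        # Trace of the sample covariance divided by n: variance of the empirical mean.
        var_proxies.append(X.var(axis=0, ddof=1).sum() / n)
    return np.array(means), np.array(var_proxies)

def neighbour_shrinkage(samples, tau=2.0):
    """Illustrative plug-in estimator (hypothetical rule, not the paper's formula):
    average each empirical mean with those of 'neighbouring' bags."""
    means, var_proxies = empirical_means(samples)
    B = len(samples)
    estimates = np.empty_like(means)
    for t in range(B):
        # Crude test: bag s is a neighbour of t if the squared distance between their
        # empirical means is small relative to the estimation noise of both means.
        dist2 = ((means - means[t]) ** 2).sum(axis=1)
        neighbours = dist2 <= tau * (var_proxies + var_proxies[t])
        # Convex combination: inverse-variance weights over the accepted neighbours
        # (bag t always accepts itself, so the weights never degenerate).
        w = np.where(neighbours, 1.0 / var_proxies, 0.0)
        estimates[t] = (w[:, None] * means).sum(axis=0) / w.sum()
    return estimates

# Toy usage: 15 high-dimensional bags whose true means form 3 groups of 5.
rng = np.random.default_rng(0)
true_means = np.repeat(rng.normal(size=(3, 200)), 5, axis=0)   # shape (15, 200)
samples = [m + rng.normal(scale=1.0, size=(20, 200)) for m in true_means]
naive = np.array([X.mean(axis=0) for X in samples])
shrunk = neighbour_shrinkage(samples)
print("naive risk :", ((naive - true_means) ** 2).sum(axis=1).mean())
print("shrunk risk:", ((shrunk - true_means) ** 2).sum(axis=1).mean())
```

In this toy setting the pooled estimator typically beats the per-bag empirical means because bags in the same group share a common true mean; the gain grows with the effective dimension, which is the regime the paper's dimensional asymptotics address.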

Related articles: Most relevant | Search more
arXiv:1511.03688 [stat.ML] (Published 2015-11-11)
Online Principal Component Analysis in High Dimension: Which Algorithm to Choose?
arXiv:2410.09973 [stat.ML] (Published 2024-10-13)
Gradient Span Algorithms Make Predictable Progress in High Dimension
arXiv:2112.14233 [stat.ML] (Published 2021-12-28, updated 2022-02-15)
Learning Across Bandits in High Dimension via Robust Statistics