arXiv Analytics

arXiv:1802.04784 [stat.ML]

MONK -- Outlier-Robust Mean Embedding Estimation by Median-of-Means

Matthieu Lerasle, Zoltan Szabo, Gaspar Massiot, Eric Moulines

Published 2018-02-13, Version 1

Mean embeddings provide an extremely flexible and powerful tool in machine learning and statistics to represent probability distributions and define a semi-metric (MMD, maximum mean discrepancy; also called N-distance or energy distance), with numerous successful applications. The representation is constructed as the expectation of the feature map defined by a kernel. As a mean, however, its classical empirical estimator can be arbitrarily severely affected even by a single outlier in the case of unbounded features. To the best of our knowledge, even the consistency of the few existing techniques that attempt to alleviate this serious sensitivity bottleneck is unknown. In this paper, we show how the recently emerged principle of median-of-means can be used to design minimax-optimal estimators for kernel mean embedding and MMD, with finite-sample strong outlier-robustness guarantees.
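To illustrate the median-of-means principle the abstract builds on (not the paper's MONK estimator itself, which operates on mean embeddings in an RKHS), here is a minimal scalar sketch: the sample is partitioned into blocks, each block's empirical mean is computed, and the median of those block means is returned. A few gross outliers can contaminate only a few blocks, so the median remains close to the true mean even when the overall empirical mean is destroyed. All function and parameter names below are illustrative.

```python
import numpy as np

def median_of_means(x, n_blocks=10, seed=None):
    """Median-of-means estimate of E[x] for a 1-D sample:
    randomly partition into n_blocks blocks, average each block,
    and take the median of the block means."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    perm = rng.permutation(len(x))          # random partition into blocks
    blocks = np.array_split(x[perm], n_blocks)
    return float(np.median([b.mean() for b in blocks]))

rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=1000)    # true mean is 0
sample[:3] = 1e6                            # three gross outliers

print(np.mean(sample))            # empirical mean: ruined by the outliers
print(median_of_means(sample))    # median-of-means: stays near 0
```

With 3 outliers and 10 blocks, at most 3 of the 10 block means are contaminated, so the median is computed over a majority of clean blocks; the empirical mean, by contrast, is shifted by roughly 3e6/1000 = 3000.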

Related articles: Most relevant | Search more
arXiv:2302.09930 [stat.ML] (Published 2023-02-20)
Nyström $M$-Hilbert-Schmidt Independence Criterion
arXiv:2006.05240 [stat.ML] (Published 2020-06-09)
How Robust is the Median-of-Means? Concentration Bounds in Presence of Outliers
arXiv:2105.14035 [stat.ML] (Published 2021-05-28)
DeepMoM: Robust Deep Learning With Median-of-Means