arXiv Analytics

Sign in

arXiv:1610.02918 [stat.ML]AbstractReferencesReviewsResources

Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

Thibault Lesieur, Caterina De Bacco, Jess Banks, Florent Krzakala, Cris Moore, Lenka Zdeborová

Published 2016-10-10Version 1

We consider the problem of Gaussian mixture clustering in the high-dimensional limit where the data consists of $m$ points in $n$ dimensions, $n,m \rightarrow \infty$ and $\alpha = m/n$ stays finite. Using exact but non-rigorous methods from statistical physics, we determine the critical value of $\alpha$ and the distance between the clusters at which it becomes information-theoretically possible to reconstruct the membership into clusters better than chance. We also determine the accuracy achievable by the Bayes-optimal estimation algorithm. In particular, we find that when the number of clusters is sufficiently large, $r > 4 + 2 \sqrt{\alpha}$, there is a gap between the threshold for information-theoretically optimal performance and the threshold at which known algorithms succeed.

Related articles: Most relevant | Search more
arXiv:2010.07955 [stat.ML] (Published 2020-10-15)
Cascade of Phase Transitions for Multi-Scale Clustering
arXiv:2302.06025 [stat.ML] (Published 2023-02-12)
Beyond UCB: Statistical Complexity and Optimal Algorithms for Non-linear Ridge Bandits
arXiv:2205.13527 [stat.ML] (Published 2022-05-26)
Subspace clustering in high-dimensions: Phase transitions \& Statistical-to-Computational gap