arXiv Analytics

arXiv:1912.00458 [stat.ML]

On the optimality of kernels for high-dimensional clustering

Leena Chennuru Vankadara, Debarghya Ghoshdastidar

Published 2019-12-01, Version 1

This paper studies the optimality of kernel methods for high-dimensional data clustering. Recent works have studied the large-sample performance of kernel clustering in the high-dimensional regime, where Euclidean distance becomes less informative. However, it is unknown whether popular kernel methods, such as kernel k-means, are optimal in this regime. We consider the problem of high-dimensional Gaussian clustering and show that, with the exponential kernel function, the sufficient conditions for partial recovery of clusters under the NP-hard kernel k-means objective match the known information-theoretic limit up to a factor of $\sqrt{2}$ for large $k$. They also exactly match the known upper bounds for the non-kernel setting. We further show that a semi-definite relaxation of the kernel k-means procedure matches, up to constant factors, the spectral threshold below which no polynomial-time algorithm is known to succeed. This is the first work that provides such optimality guarantees for kernel k-means as well as its convex relaxation. Our proofs demonstrate the utility of the lesser-known polynomial concentration results for random variables with exponentially decaying tails in a higher-order analysis of kernel methods.
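As a concrete illustration of the objects discussed in the abstract, the sketch below runs Lloyd-style kernel k-means on a high-dimensional Gaussian mixture with an exponential kernel. The kernel parameterization $k(x, y) = \exp(-\lVert x - y \rVert / \sigma)$, the bandwidth choice $\sigma = \sqrt{d}$, and the Lloyd heuristic itself are illustrative assumptions: the paper's guarantees concern the exact (NP-hard) kernel k-means objective and its convex relaxation, not this heuristic.

```python
import numpy as np

def exponential_kernel(X, sigma=1.0):
    # Pairwise exponential kernel k(x, y) = exp(-||x - y|| / sigma).
    # NOTE: the precise kernel form analyzed in the paper may differ;
    # this parameterization is an assumption for illustration.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
    return np.exp(-np.sqrt(d2) / sigma)

def kernel_kmeans(K, k, n_iter=50, seed=0):
    # Lloyd-style kernel k-means heuristic: reassign each point to the
    # cluster minimizing its implicit feature-space distance, computed
    # from the kernel matrix K alone:
    #   ||phi(x_i) - mu_c||^2 = K_ii - 2 mean_{j in C} K_ij
    #                           + mean_{j, l in C} K_jl
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    labels = rng.integers(k, size=n)
    for _ in range(n_iter):
        dist = np.zeros((n, k))
        for c in range(k):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:
                dist[:, c] = np.inf  # guard against empty clusters
                continue
            dist[:, c] = (np.diag(K)
                          - 2 * K[:, idx].mean(axis=1)
                          + K[np.ix_(idx, idx)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Toy high-dimensional Gaussian mixture: weakly separated means,
# dimension d much larger than the separation scale.
rng = np.random.default_rng(1)
n, d, k = 200, 500, 4
means = rng.normal(scale=2.0 / np.sqrt(d), size=(k, d))
true = rng.integers(k, size=n)
X = means[true] + rng.normal(size=(n, d))
labels = kernel_kmeans(exponential_kernel(X, sigma=np.sqrt(d)), k)
```

For reference, a standard semi-definite relaxation of kernel k-means (the Peng-Wei form) maximizes $\langle K, Z \rangle$ over matrices $Z \succeq 0$ with nonnegative entries, $Z\mathbf{1} = \mathbf{1}$, and $\mathrm{tr}(Z) = k$; the abstract does not specify whether the paper's relaxation takes exactly this form.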
