arXiv Analytics

Sign in

arXiv:2209.05812 [stat.ML]AbstractReferencesReviewsResources

Addressing overfitting in spectral clustering via a non-parametric bootstrap

Liam Welsh, Phillip Shreeves

Published 2022-09-13Version 1

Finite mixture modelling is a popular method in the field of clustering and is beneficial largely due to its soft cluster membership probabilities. However, the most common algorithm for fitting finite mixture models, the EM algorithm, falls victim to a number of issues. We address these issues that plague clustering using finite mixture models, including convergence to solutions corresponding to local maxima and algorithm speed concerns in high dimensional cases. This is done by developing two novel algorithms that incorporate a spectral decomposition of the data matrix and a non-parametric bootstrap sampling scheme. Simulations show the validity of our algorithms and demonstrate not only their flexibility but also their ability to avoid solutions corresponding to local-maxima, when compared to other (bootstrapped) clustering algorithms for estimating finite mixture models. Our novel algorithms have a typically more consistent convergence criteria as well as a significant increase in speed over other bootstrapped algorithms that fit finite mixture models.

Related articles: Most relevant | Search more
arXiv:2305.11766 [stat.ML] (Published 2023-05-19)
Transfer operators on graphs: Spectral clustering and beyond
arXiv:2108.08687 [stat.ML] (Published 2021-08-18)
Clustering dynamics on graphs: from spectral clustering to mean shift through Fokker-Planck interpolation
arXiv:2012.04646 [stat.ML] (Published 2020-12-07)
Spectral clustering via adaptive layer aggregation for multi-layer networks