arXiv Analytics

arXiv:1401.0247 [cs.LG]

Robust Hierarchical Clustering

Maria-Florina Balcan, Yingyu Liang, Pramod Gupta

Published 2014-01-01, updated 2014-07-13 (Version 2)

One of the most widely used techniques for data clustering is agglomerative clustering. Such algorithms have long been used across many different fields, ranging from computational biology to social sciences to computer vision, in part because their output is easy to interpret. It is well known, however, that many of the classic agglomerative clustering algorithms are not robust to noise. In this paper we propose and analyze a new robust algorithm for bottom-up agglomerative clustering. We show that our algorithm can be used to cluster accurately in cases where the data satisfies a number of natural properties and where the traditional agglomerative algorithms fail. We also show how to adapt our algorithm to the inductive setting, where the given data is only a small random sample of the entire data set. Experimental evaluations on synthetic and real-world data sets show that our algorithm achieves better performance than other hierarchical algorithms in the presence of noise.
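
To make the noise sensitivity concrete, here is a minimal sketch of a classic bottom-up method, single linkage, on one-dimensional toy data. It is not the robust algorithm proposed in the paper; the function name `single_linkage` and the data values are assumptions chosen purely for illustration.

```python
from itertools import combinations


def single_linkage(points, k):
    """Naive bottom-up single-linkage clustering of 1-D points into k clusters.

    Illustrative only: this is the classic baseline, not the paper's method.
    """
    # Start with every point in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Pick the pair of clusters whose closest members are nearest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                abs(a - b) for a in clusters[ij[0]] for b in clusters[ij[1]]
            ),
        )
        # Merge cluster j into cluster i (i < j, so deleting j is safe).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical data: two well-separated groups.
group_a = [0, 1, 2]
group_b = [10, 11, 12]
clean = single_linkage(group_a + group_b, k=2)

# The same groups plus a few noise points: a sparse "bridge" and one outlier.
noise = [3.5, 5, 6.5, 8, 30]
noisy = single_linkage(group_a + group_b + noise, k=2)

print(clean)
print(noisy)
```

On the clean data the two groups are recovered. With the noise points added, the bridge chains the two groups into a single cluster and the outlier becomes the second cluster, which is the kind of failure the abstract refers to.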

Related articles:
arXiv:2108.07247 [cs.LG] (Published 2021-08-16)
Robust Hierarchical Clustering for Directed Networks: An Axiomatic Approach
arXiv:2108.07926 [cs.LG] (Published 2021-08-18)
Learning to Collaborate
arXiv:2302.00192 [cs.LG] (Published 2023-02-01)
Density peak clustering using tensor network