{ "id": "1703.06700", "version": "v1", "published": "2017-03-20T12:09:53.000Z", "updated": "2017-03-20T12:09:53.000Z", "title": "Independence clustering (without a matrix)", "authors": [ "Daniil Ryabko" ], "categories": [ "cs.LG", "cs.IT", "math.IT", "stat.ML" ], "abstract": "The independence clustering problem is considered in the following formulation: given a set $S$ of random variables, it is required to find the finest partitioning $\\{U_1,\\dots,U_k\\}$ of $S$ into clusters such that the clusters $U_1,\\dots,U_k$ are mutually independent. Since mutual independence is the target, pairwise similarity measurements are of no use, and thus traditional clustering algorithms are inapplicable. The distribution of the random variables in $S$ is, in general, unknown, but a sample is available. Thus, the problem is cast in terms of time series. Two forms of sampling are considered: i.i.d.\\ and stationary time series, with the main emphasis being on the latter, more general, case. A consistent, computationally tractable algorithm for each of the settings is proposed, and a number of open directions for further research are outlined.", "revisions": [ { "version": "v1", "updated": "2017-03-20T12:09:53.000Z" } ], "analyses": { "keywords": [ "random variables", "stationary time series", "mutual independence", "pairwise similarity measurements", "independence clustering problem" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }