{ "id": "2402.05715", "version": "v1", "published": "2024-02-08T14:43:56.000Z", "updated": "2024-02-08T14:43:56.000Z", "title": "Collaborative non-parametric two-sample testing", "authors": [ "Alejandro de la Concha", "Nicolas Vayatis", "Argyris Kalogeratos" ], "categories": [ "stat.ML", "cs.LG" ], "abstract": "This paper addresses the multiple two-sample test problem in a graph-structured setting, which is a common scenario in fields such as Spatial Statistics and Neuroscience. Each node $v$ in a fixed graph deals with a two-sample testing problem between two node-specific probability density functions (pdfs), $p_v$ and $q_v$. The goal is to identify nodes where the null hypothesis $p_v = q_v$ should be rejected, under the assumption that connected nodes would yield similar test outcomes. We propose the non-parametric collaborative two-sample testing (CTST) framework that efficiently leverages the graph structure and minimizes the assumptions over $p_v$ and $q_v$. Our methodology integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning. We use synthetic experiments and a real sensor network detecting seismic activity to demonstrate that CTST outperforms state-of-the-art non-parametric statistical tests that are applied at each node independently and hence disregard the geometry of the problem.", "revisions": [ { "version": "v1", "updated": "2024-02-08T14:43:56.000Z" } ], "analyses": { "keywords": [ "collaborative non-parametric two-sample testing", "state-of-the-art non-parametric statistical tests", "network detecting seismic activity", "outperforms state-of-the-art non-parametric statistical", "ctst outperforms state-of-the-art non-parametric" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }