arXiv Analytics

arXiv:2008.02464 [stat.ML]

Concentration Bounds for Co-occurrence Matrices of Markov Chains

Jiezhong Qiu, Chi Wang, Ben Liao, Richard Peng, Jie Tang

Published 2020-08-06Version 1

Co-occurrence statistics for sequential data are common and important signals in machine learning, providing rich correlation and clustering information about the underlying object space. We give the first bound on the convergence rate for estimating the co-occurrence matrix of a regular (aperiodic and irreducible) finite Markov chain from a single random trajectory. Our work is motivated by the analysis of the well-known graph learning algorithm DeepWalk by [Qiu et al. WSDM '18], who study the convergence (in probability) of the co-occurrence matrix of random walks on undirected graphs in the limit, but leave the convergence rate as an open problem. We prove a Chernoff-type bound for sums of matrix-valued random variables sampled via an ergodic Markov chain, generalizing the regular-undirected-graph case studied by [Garg et al. STOC '18]. Using this Chernoff-type bound, we show that, given a regular Markov chain with $n$ states and mixing time $\tau$, a trajectory of length $O(\tau (\log{(n)}+\log{(\tau)})/\epsilon^2)$ suffices to estimate the co-occurrence matrix within error $\epsilon$. We conduct several experiments, and the results are consistent with the exponentially fast convergence rate predicted by the theoretical analysis. Our result gives the first sample complexity analysis in graph representation learning.
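To make the estimation problem concrete, here is a minimal NumPy sketch of estimating a symmetrized, window-based co-occurrence matrix (in the DeepWalk style) from a single trajectory of a small regular Markov chain. The chain (a lazy random walk on a cycle), the window size, and the normalization are illustrative assumptions, not the paper's exact experimental setup; the point is that the estimate from a longer trajectory should be closer to the truth, as the $O(\tau(\log n + \log \tau)/\epsilon^2)$ bound suggests.

```python
import numpy as np

def sample_trajectory(P, length, rng):
    """Sample one trajectory from a Markov chain with transition matrix P."""
    n = P.shape[0]
    states = np.empty(length, dtype=int)
    states[0] = rng.integers(n)
    for t in range(1, length):
        states[t] = rng.choice(n, p=P[states[t - 1]])
    return states

def cooccurrence(traj, n, window):
    """Symmetrized co-occurrence estimate over windows of size `window`."""
    C = np.zeros((n, n))
    L = len(traj) - window  # number of window starting positions
    for i in range(L):
        for r in range(1, window + 1):
            # each ordered pair contributes half its weight to each direction
            C[traj[i], traj[i + r]] += 0.5
            C[traj[i + r], traj[i]] += 0.5
    return C / (L * window)  # entries sum to 1

rng = np.random.default_rng(0)

# A small regular (aperiodic, irreducible) chain: lazy random walk on a cycle.
n = 8
P = np.zeros((n, n))
for s in range(n):
    P[s, s] = 0.5
    P[s, (s - 1) % n] = 0.25
    P[s, (s + 1) % n] = 0.25

short = cooccurrence(sample_trajectory(P, 2_000, rng), n, window=2)
long_ = cooccurrence(sample_trajectory(P, 200_000, rng), n, window=2)
# spectral-norm gap between a short-trajectory and long-trajectory estimate
err = np.linalg.norm(short - long_, 2)
```

The error bound in the abstract is stated in spectral norm, which is why the gap above is measured with `np.linalg.norm(..., 2)` rather than entrywise.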
