arXiv Analytics

arXiv:1112.5391 [cond-mat.stat-mech]

Number of relevant directions in Principal Component Analysis and Wishart random matrices

Satya N. Majumdar, Pierpaolo Vivo

Published 2011-12-22, Version 1

We compute analytically, for large $N$, the probability $\mathcal{P}(N_+,N)$ that an $N\times N$ Wishart random matrix has $N_+$ eigenvalues exceeding a threshold $N\zeta$, including its large deviation tails. This probability plays a benchmark role when performing the Principal Component Analysis of a large empirical dataset. We find that $\mathcal{P}(N_+,N)\approx\exp(-\beta N^2 \psi_\zeta(N_+/N))$, where $\beta$ is the Dyson index of the ensemble and $\psi_\zeta(\kappa)$ is a rate function that we compute explicitly in the full range $0\leq \kappa\leq 1$ and for any $\zeta$. The rate function $\psi_\zeta(\kappa)$ displays a quadratic behavior modulated by a logarithmic singularity close to its minimum $\kappa^\star(\zeta)$. This is shown to be a consequence of a phase transition in an associated Coulomb gas problem. The variance $\Delta(N)$ of the number of relevant components is also shown to grow universally (independent of $\zeta$) as $\Delta(N)\sim (\beta \pi^2)^{-1}\ln N$ for large $N$.
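
The quantity $N_+$ described in the abstract can be explored numerically. The following is a minimal Python sketch, not the authors' method: it assumes a real ($\beta=1$) Wishart matrix built as $W = X^T X$ from an $N\times N$ matrix $X$ of i.i.d. standard normal entries, and the function names `count_above_threshold` and `sample_counts` are hypothetical, introduced only for illustration. It counts the eigenvalues of $W$ that exceed $N\zeta$ and collects that count over independent samples.

```python
import numpy as np


def count_above_threshold(n, zeta, rng):
    """Draw one real n x n Wishart matrix and count eigenvalues above n * zeta."""
    # Assumption: W = X^T X with X an n x n matrix of i.i.d. standard
    # normal entries (the beta = 1 case); other conventions rescale zeta.
    x = rng.standard_normal((n, n))
    w = x.T @ x
    # eigvalsh is for symmetric matrices and returns real eigenvalues.
    eigenvalues = np.linalg.eigvalsh(w)
    return int(np.sum(eigenvalues > n * zeta))


def sample_counts(n, zeta, trials, seed=0):
    """Return an array with N_+ for `trials` independent Wishart samples."""
    rng = np.random.default_rng(seed)
    return np.array([count_above_threshold(n, zeta, rng) for _ in range(trials)])


if __name__ == "__main__":
    n, zeta = 200, 1.0
    counts = sample_counts(n, zeta, trials=200)
    print("mean fraction N_+/N:", counts.mean() / n)
    print("empirical variance of N_+:", counts.var(ddof=1))
    # Leading-order growth quoted in the abstract, evaluated for beta = 1.
    print("ln(N) / (beta * pi^2):", np.log(n) / np.pi**2)
```

The empirical variance need not match $(\beta\pi^2)^{-1}\ln N$ closely at moderate $N$, since the abstract states only the leading large-$N$ growth and finite-$N$ corrections are not specified there. Direct sampling of this kind probes only typical fluctuations around $\kappa^\star(\zeta)$; the large deviation tails are far too rare to be reached this way.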

Comments: 5 pages, 2 figures
Journal: Phys. Rev. Lett. 108, 200601 (2012)
Related articles:
arXiv:0903.1494 [cond-mat.stat-mech] (Published 2009-03-09, updated 2009-06-12)
Non-intersecting Brownian Interfaces and Wishart Random Matrices
arXiv:cond-mat/0701371 (Published 2007-01-16, updated 2007-05-09)
Large Deviations of the Maximum Eigenvalue in Wishart Random Matrices
arXiv:1104.3665 [cond-mat.stat-mech] (Published 2011-04-19)
High-Dimensional Inference with the generalized Hopfield Model: Principal Component Analysis and Corrections