arXiv Analytics


arXiv:2104.13968 [cs.LG]

Tail-Net: Extracting Lowest Singular Triplets for Big Data Applications

Gurpreet Singh, Soumyajit Gupta

Published 2021-04-28, Version 1

Singular Value Decomposition (SVD) serves as an exploratory tool for identifying the dominant features in the form of the top rank-r singular factors corresponding to the largest singular values. For Big Data applications, it is well known that SVD is restrictive due to main memory requirements. However, a number of applications such as community detection, clustering, or bottleneck identification in large-scale graph datasets rely upon identifying the lowest singular values and the corresponding singular vectors. For example, the lowest singular values of a graph Laplacian reveal the number of isolated clusters (zero singular values) or bottlenecks (lowest non-zero singular values) for undirected, acyclic graphs. A naive approach here would be to perform a full SVD; however, this quickly becomes infeasible for practical big data applications due to the enormous memory requirements. Furthermore, such applications require only a few of the lowest singular factors, making a full decomposition computationally exorbitant. In this work, we trivially extend the previously proposed Range-Net to Tail-Net for a memory- and compute-efficient extraction of the lowest singular factors of a given big dataset and a specified rank-r. We present a number of numerical experiments on both synthetic and practical datasets for verification and benchmarking, using conventional SVD as the baseline.
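
As a small illustration of the graph Laplacian example in the abstract, the sketch below uses the naive full-SVD baseline (not Tail-Net itself, whose details are not given here) on a toy graph. The helper names `graph_laplacian` and `lowest_singular_triplets`, the six-node graph, and the zero tolerance `1e-8` are assumptions made for this example only.

```python
import numpy as np

def graph_laplacian(n_nodes, edges):
    """Unnormalized Laplacian L = D - A of an undirected graph."""
    A = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        A[i, j] = 1.0
        A[j, i] = 1.0
    D = np.diag(A.sum(axis=1))
    return D - A

def lowest_singular_triplets(X, r):
    """Return the r lowest singular triplets (U_r, s_r, Vt_r) via a full SVD.

    This is the naive baseline: it computes every singular factor and
    then keeps only the tail, so memory grows with the full matrix size.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # NumPy returns singular values in descending order, so take the
    # last r entries and reverse them into ascending order.
    idx = np.arange(len(s) - 1, len(s) - 1 - r, -1)
    return U[:, idx], s[idx], Vt[idx, :]

# Hypothetical toy graph: two disconnected 3-node paths (0-1-2 and 3-4-5).
edges = [(0, 1), (1, 2), (3, 4), (4, 5)]
L = graph_laplacian(6, edges)

U, s, Vt = lowest_singular_triplets(L, r=3)

# Zero singular values count the isolated clusters; the tolerance is an
# assumption chosen to absorb floating-point round-off.
n_clusters = int(np.sum(s < 1e-8))
print("lowest singular values:", s)
print("isolated clusters:", n_clusters)
```

For this toy graph the two (numerically) zero singular values correspond to the two disconnected paths, and the next singular value is the lowest non-zero one. The full decomposition is affordable only because the matrix is tiny, which is the limitation the abstract describes for large datasets.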

Related articles:
arXiv:2403.11395 [cs.LG] (Published 2024-03-18)
Automated data processing and feature engineering for deep learning and big data applications: a survey
arXiv:2310.09819 [cs.LG] (Published 2023-10-15)
Optimizing K-means for Big Data: A Comparative Study
arXiv:1509.08062 [cs.LG] (Published 2015-09-27)
End-to-End Text-Dependent Speaker Verification