{ "id": "2212.12674", "version": "v1", "published": "2022-12-24T07:15:00.000Z", "updated": "2022-12-24T07:15:00.000Z", "title": "Data-Driven Linear Complexity Low-Rank Approximation of General Kernel Matrices: A Geometric Approach", "authors": [ "Difeng Cai", "Edmond Chow", "Yuanzhe Xi" ], "categories": [ "math.NA", "cs.LG", "cs.NA" ], "abstract": "A general, {\\em rectangular} kernel matrix may be defined as $K_{ij} = \\kappa(x_i,y_j)$ where $\\kappa(x,y)$ is a kernel function and where $X=\\{x_i\\}_{i=1}^m$ and $Y=\\{y_i\\}_{i=1}^n$ are two sets of points. In this paper, we seek a low-rank approximation to a kernel matrix where the sets of points $X$ and $Y$ are large and are not well-separated (e.g., the points in $X$ and $Y$ may be ``intermingled''). Such rectangular kernel matrices may arise, for example, in Gaussian process regression where $X$ corresponds to the training data and $Y$ corresponds to the test data. In this case, the points are often high-dimensional. Since the point sets are large, we must exploit the fact that the matrix arises from a kernel function, and avoid forming the matrix, and thus ruling out most algebraic techniques. In particular, we seek methods that can scale linearly, i.e., with computational complexity $O(m)$ or $O(n)$ for a fixed accuracy or rank. The main idea in this paper is to {\\em geometrically} select appropriate subsets of points to construct a low rank approximation. An analysis in this paper guides how this selection should be performed.", "revisions": [ { "version": "v1", "updated": "2022-12-24T07:15:00.000Z" } ], "analyses": { "keywords": [ "data-driven linear complexity low-rank approximation", "kernel matrix", "general kernel matrices", "geometric approach", "kernel function" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }