{ "id": "2212.12674", "version": "v1", "published": "2022-12-24T07:15:00.000Z", "updated": "2022-12-24T07:15:00.000Z", "title": "Data-Driven Linear Complexity Low-Rank Approximation of General Kernel Matrices: A Geometric Approach", "authors": [ "Difeng Cai", "Edmond Chow", "Yuanzhe Xi" ], "categories": [ "math.NA", "cs.LG", "cs.NA" ], "abstract": "A general, {\\em rectangular} kernel matrix may be defined as $K_{ij} = \\kappa(x_i,y_j)$ where $\\kappa(x,y)$ is a kernel function and where $X=\\{x_i\\}_{i=1}^m$ and $Y=\\{y_i\\}_{i=1}^n$ are two sets of points. In this paper, we seek a low-rank approximation to a kernel matrix where the sets of points $X$ and $Y$ are large and are not well-separated (e.g., the points in $X$ and $Y$ may be ``intermingled''). Such rectangular kernel matrices may arise, for example, in Gaussian process regression where $X$ corresponds to the training data and $Y$ corresponds to the test data. In this case, the points are often high-dimensional. Since the point sets are large, we must exploit the fact that the matrix arises from a kernel function, and avoid forming the matrix, and thus ruling out most algebraic techniques. In particular, we seek methods that can scale linearly, i.e., with computational complexity $O(m)$ or $O(n)$ for a fixed accuracy or rank. The main idea in this paper is to {\\em geometrically} select appropriate subsets of points to construct a low rank approximation. An analysis in this paper guides how this selection should be performed.", "revisions": [ { "version": "v1", "updated": "2022-12-24T07:15:00.000Z" } ], "analyses": { "keywords": [ "data-driven linear complexity low-rank approximation", "kernel matrix", "general kernel matrices", "geometric approach", "kernel function" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }