{ "id": "1310.7063", "version": "v3", "published": "2013-10-26T03:30:46.000Z", "updated": "2015-07-01T05:55:34.000Z", "title": "On the Convergence of Decentralized Gradient Descent", "authors": [ "Kun Yuan", "Qing Ling", "Wotao Yin" ], "categories": [ "math.OC" ], "abstract": "Consider the consensus problem of minimizing $f(x)=\\sum_{i=1}^n f_i(x)$ where each $f_i$ is only known to one individual agent $i$ out of a connected network of $n$ agents. All the agents shall collaboratively solve this problem and obtain the solution subject to data exchanges restricted to between neighboring agents. Such algorithms avoid the need of a fusion center, offer better network load balance, and improve data privacy. We study the decentralized gradient descent method in which each agent $i$ updates its variable $x_{(i)}$, which is a local approximate to the unknown variable $x$, by combining the average of its neighbors' with the negative gradient step $-\\alpha \\nabla f_i(x_{(i)})$. The iteration is $$x_{(i)}(k+1) \\gets \\sum_{\\text{neighbor} j \\text{of} i} w_{ij} x_{(j)}(k) - \\alpha \\nabla f_i(x_{(i)}(k)),\\quad\\text{for each agent} i,$$ where the averaging coefficients form a symmetric doubly stochastic matrix $W=[w_{ij}] \\in \\mathbb{R}^{n \\times n}$. We analyze the convergence of this iteration and derive its converge rate, assuming that each $f_i$ is proper closed convex and lower bounded, $\\nabla f_i$ is Lipschitz continuous with constant $L_{f_i}$, and stepsize $\\alpha$ is fixed. Provided that $\\alpha < O(1/L_h)$ where $L_h=\\max_i\\{L_{f_i}\\}$, the objective error at the averaged solution, $f(\\frac{1}{n}\\sum_i x_{(i)}(k))-f^*$, reduces at a speed of $O(1/k)$ until it reaches $O(\\alpha)$. If $f_i$ are further (restricted) strongly convex, then both $\\frac{1}{n}\\sum_i x_{(i)}(k)$ and each $x_{(i)}(k)$ converge to the global minimizer $x^*$ at a linear rate until reaching an $O(\\alpha)$-neighborhood of $x^*$. We also develop an iteration for decentralized basis pursuit and establish its linear convergence to an $O(\\alpha)$-neighborhood of the true unknown sparse signal.", "revisions": [ { "version": "v2", "updated": "2014-02-09T01:16:24.000Z", "comment": null, "journal": null, "doi": null }, { "version": "v3", "updated": "2015-07-01T05:55:34.000Z" } ], "analyses": { "keywords": [ "convergence", "offer better network load balance", "true unknown sparse signal", "symmetric doubly stochastic matrix", "decentralized gradient descent method" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable", "adsabs": "2013arXiv1310.7063Y" } } }