{ "id": "1310.7063", "version": "v3", "published": "2013-10-26T03:30:46.000Z", "updated": "2015-07-01T05:55:34.000Z", "title": "On the Convergence of Decentralized Gradient Descent", "authors": [ "Kun Yuan", "Qing Ling", "Wotao Yin" ], "categories": [ "math.OC" ], "abstract": "Consider the consensus problem of minimizing $f(x)=\\sum_{i=1}^n f_i(x)$ where each $f_i$ is only known to one individual agent $i$ out of a connected network of $n$ agents. All the agents shall collaboratively solve this problem and obtain the solution subject to data exchanges restricted to between neighboring agents. Such algorithms avoid the need of a fusion center, offer better network load balance, and improve data privacy. We study the decentralized gradient descent method in which each agent $i$ updates its variable $x_{(i)}$, which is a local approximate to the unknown variable $x$, by combining the average of its neighbors' with the negative gradient step $-\\alpha \\nabla f_i(x_{(i)})$. The iteration is $$x_{(i)}(k+1) \\gets \\sum_{\\text{neighbor} j \\text{of} i} w_{ij} x_{(j)}(k) - \\alpha \\nabla f_i(x_{(i)}(k)),\\quad\\text{for each agent} i,$$ where the averaging coefficients form a symmetric doubly stochastic matrix $W=[w_{ij}] \\in \\mathbb{R}^{n \\times n}$. We analyze the convergence of this iteration and derive its converge rate, assuming that each $f_i$ is proper closed convex and lower bounded, $\\nabla f_i$ is Lipschitz continuous with constant $L_{f_i}$, and stepsize $\\alpha$ is fixed. Provided that $\\alpha < O(1/L_h)$ where $L_h=\\max_i\\{L_{f_i}\\}$, the objective error at the averaged solution, $f(\\frac{1}{n}\\sum_i x_{(i)}(k))-f^*$, reduces at a speed of $O(1/k)$ until it reaches $O(\\alpha)$. If $f_i$ are further (restricted) strongly convex, then both $\\frac{1}{n}\\sum_i x_{(i)}(k)$ and each $x_{(i)}(k)$ converge to the global minimizer $x^*$ at a linear rate until reaching an $O(\\alpha)$-neighborhood of $x^*$. We also develop an iteration for decentralized basis pursuit and establish its linear convergence to an $O(\\alpha)$-neighborhood of the true unknown sparse signal.", "revisions": [ { "version": "v2", "updated": "2014-02-09T01:16:24.000Z", "comment": null, "journal": null, "doi": null }, { "version": "v3", "updated": "2015-07-01T05:55:34.000Z" } ], "analyses": { "keywords": [ "convergence", "offer better network load balance", "true unknown sparse signal", "symmetric doubly stochastic matrix", "decentralized gradient descent method" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable", "adsabs": "2013arXiv1310.7063Y" } } }