arXiv:1601.03466 [cs.LG]
Dynamic Privacy For Distributed Machine Learning Over Network
Published 2016-01-14, Version 1
Privacy-preserving distributed machine learning is becoming increasingly important due to the rapid growth in the amount of data and the importance of distributed learning. This paper develops algorithms that provide privacy-preserving learning for classification problems using the regularized empirical risk minimization (ERM) objective function in a distributed fashion. We use the definition of differential privacy, developed by Dwork et al., to capture the notion of privacy of our algorithms. We provide two methods. We first propose dual variable perturbation, which perturbs the dual variable before the next intermediate minimization of the augmented Lagrangian function over the classifier in every ADMM iteration. In the second method, we apply output perturbation to the primal variable before releasing it to neighboring nodes. We call the second method primal variable perturbation. Under certain conditions on the convexity and differentiability of the loss function and regularizer, our algorithms are proved to provide differential privacy throughout the entire learning process. We also provide theoretical results for the accuracy of the algorithms, and prove that both algorithms converge in distribution. The theoretical results show that dual variable perturbation outperforms the primal case. The tradeoff between privacy and accuracy is examined in a numerical experiment. Our experiment shows that both algorithms perform similarly in managing the privacy-accuracy tradeoff, and that primal variable perturbation is slightly better than the dual case.
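To make the idea of primal variable perturbation concrete, the following is a minimal sketch of output perturbation applied to a node's local classifier before it is sent to its neighbors. The abstract does not state the noise distribution or the sensitivity bound, so this sketch assumes the standard Laplace mechanism; the names `perturb_primal`, `laplace_noise`, `l1_sensitivity`, and the numeric values are hypothetical and are not taken from the paper.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. Exp(1) draws is Laplace(0, 1); rescale it.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def perturb_primal(w, l1_sensitivity, epsilon, rng):
    """Return a noisy copy of the local primal variable w for release.

    Assumption (not from the paper): per-coordinate Laplace noise with
    scale l1_sensitivity / epsilon, i.e. the standard Laplace mechanism.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = l1_sensitivity / epsilon
    return [w_j + laplace_noise(scale, rng) for w_j in w]


# Hypothetical local classifier after one intermediate minimization step.
rng = random.Random(0)
w_local = [0.5, -1.2, 0.3]

# Only the perturbed copy is shared with neighbors; w_local stays private.
w_released = perturb_primal(w_local, l1_sensitivity=0.1, epsilon=1.0, rng=rng)
```

A smaller `epsilon` gives stronger privacy but larger noise, which is one way to see the privacy-accuracy tradeoff that the abstract refers to.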