arXiv:1806.05358 [cs.LG]

Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning

Dong Yin, Yudong Chen, Kannan Ramchandran, Peter Bartlett

Published 2018-06-14 (Version 1)

In this paper, we study robust large-scale distributed learning in the presence of saddle points in non-convex loss functions. We consider the Byzantine setting, where some worker machines may have abnormal or even arbitrary and adversarial behavior. We argue that in the Byzantine setting, optimizing a non-convex function and escaping saddle points become much more challenging, even when robust gradient estimators are used. We develop ByzantinePGD, a robust and communication-efficient algorithm that can provably escape saddle points and converge to approximate local minimizers. The iteration complexity of our algorithm in the Byzantine setting matches that of standard gradient descent in the usual setting. We further provide three robust aggregation subroutines that can be used in ByzantinePGD: median, trimmed mean, and iterative filtering. We characterize their performance in statistical settings, and argue for their near-optimality in different regimes, including the high-dimensional setting.
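The median and trimmed-mean aggregators are simple enough to illustrate. Below is a minimal NumPy sketch of the two coordinate-wise rules, plus a schematic perturbed-descent loop in the spirit of ByzantinePGD (take robust-gradient steps; when the aggregated gradient is small, inject a random perturbation to help escape a possible saddle point). The function names, hyperparameters, and toy data are illustrative assumptions, not the paper's implementation, and the iterative-filtering subroutine is omitted for brevity.

```python
import numpy as np

def coordinate_median(grads):
    """Coordinate-wise median across workers (rows = workers)."""
    return np.median(grads, axis=0)

def trimmed_mean(grads, beta):
    """Coordinate-wise beta-trimmed mean: per coordinate, discard the
    beta-fraction largest and smallest entries, then average the rest."""
    m = grads.shape[0]
    k = int(np.floor(beta * m))  # number of entries trimmed at each end
    s = np.sort(grads, axis=0)
    return s[k:m - k].mean(axis=0)

def byzantine_pgd_sketch(grad_fn, x0, steps, eta, r, eps, beta, rng):
    """Schematic perturbed descent (not the authors' algorithm): robustly
    aggregate worker gradients each round; if the aggregate is small (a
    candidate saddle point), perturb uniformly on a sphere of radius r."""
    x = x0.copy()
    for _ in range(steps):
        g = trimmed_mean(grad_fn(x), beta)   # robust gradient estimate
        if np.linalg.norm(g) <= eps:
            u = rng.normal(size=x.shape)
            x = x + r * u / np.linalg.norm(u)  # random escape perturbation
        else:
            x = x - eta * g                    # ordinary descent step
    return x

# Toy demo: f(x) = ||x||^2 / 2, 7 honest workers + 3 Byzantine workers
# that always report a large constant gradient.
def toy_grads(x, rng=np.random.default_rng(1)):
    honest = x + rng.normal(scale=0.01, size=(7, x.size))
    byz = np.full((3, x.size), 50.0)  # adversarial gradients
    return np.vstack([honest, byz])

rng = np.random.default_rng(0)
x_final = byzantine_pgd_sketch(toy_grads, np.ones(4), steps=200,
                               eta=0.1, r=0.01, eps=1e-3, beta=0.3, rng=rng)
print(x_final)  # ends near the minimizer at 0 despite the attack
```

With beta = 0.3 and 10 workers, three entries are trimmed from each end of every coordinate, so the three adversarial values are always discarded; the coordinate-wise median is robust here for the same reason, since a majority of workers are honest.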

Related articles:
arXiv:1911.09721 [cs.LG] (Published 2019-11-21)
Communication-Efficient and Byzantine-Robust Distributed Learning
arXiv:2305.13856 [cs.LG] (Published 2023-05-23)
On the Optimal Batch Size for Byzantine-Robust Distributed Learning
arXiv:1803.01498 [cs.LG] (Published 2018-03-05)
Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates