Decentralized Local Updates with Dual-Slow Estimation and Momentum-Based Variance-Reduction for Non-Convex Optimization

  • Kangyang Luo
  • Kunkun Zhang
  • Shengbo Zhang
  • Xiang Li*
  • Ming Gao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Scopus citation

Abstract

Decentralized learning (DL) has recently employed local updates to reduce the communication cost for general non-convex optimization problems. Specifically, local updates require each node to perform multiple update steps on the parameters of its local model before communicating with others. However, most existing methods can be highly sensitive to data heterogeneity (i.e., non-iid data distributions) and are adversely affected by stochastic gradient noise. In this paper, we propose DSE-MVR to address these problems. DSE-MVR introduces a dual-slow estimation strategy that uses gradient tracking to estimate the global accumulated update direction, which handles data heterogeneity; to handle stochastic noise, it uses a mini-batch momentum-based variance-reduction technique. We theoretically prove that DSE-MVR achieves optimal convergence results for general non-convex optimization in both iid and non-iid data distribution settings. In particular, the leading terms in the convergence rates derived for DSE-MVR are independent of the stochastic noise for large batches or large partial average intervals (i.e., large numbers of local update steps). Further, we put forward DSE-SGD and theoretically justify the importance of the dual-slow estimation strategy in the data heterogeneity setting. Finally, we conduct extensive experiments to show the superiority of DSE-MVR over other state-of-the-art approaches. We provide our code here: https://anonymous.4open.science/r/DSE-MVR-32B8/.
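
To make the two ingredients named in the abstract more concrete, the following is a minimal toy sketch of decentralized local updates combined with a generic momentum-based variance-reduction estimator (in the STORM style). It is not the DSE-MVR algorithm: the dual-slow estimation and gradient tracking steps are omitted, and the scalar quadratic objectives, the ring topology with uniform 1/3 mixing weights, and every name and parameter value (`run`, `noisy_grad`, `alpha`, `sigma`, and so on) are assumptions made purely for illustration.

```python
import random


def noisy_grad(x, target, noise):
    # Stochastic gradient of the toy local loss f_i(x) = 0.5 * (x - target)^2,
    # perturbed by an additive noise sample (an assumption of this sketch).
    return (x - target) + noise


def run(num_nodes=8, rounds=200, local_steps=5, lr=0.05,
        alpha=0.1, sigma=0.5, seed=0):
    rng = random.Random(seed)
    # Heterogeneous (non-iid) data: every node has a different local optimum.
    targets = [float(i) for i in range(num_nodes)]
    x = [0.0] * num_nodes
    x_prev = list(x)  # point at which each node's estimator was last evaluated
    v = [noisy_grad(x[i], targets[i], rng.gauss(0.0, sigma))
         for i in range(num_nodes)]

    for _ in range(rounds):
        # Local updates: several steps per node without any communication.
        for _ in range(local_steps):
            for i in range(num_nodes):
                noise = rng.gauss(0.0, sigma)
                g_new = noisy_grad(x[i], targets[i], noise)
                # Same noise sample, evaluated at the previous point.
                g_old = noisy_grad(x_prev[i], targets[i], noise)
                # Momentum-based variance-reduced gradient estimator.
                v[i] = g_new + (1.0 - alpha) * (v[i] - g_old)
                x_prev[i] = x[i]
                x[i] -= lr * v[i]
        # Communication: average with the two ring neighbours (gossip step).
        x = [(x[i - 1] + x[i] + x[(i + 1) % num_nodes]) / 3.0
             for i in range(num_nodes)]
    return x, targets


if __name__ == "__main__":
    xs, targets = run()
    print("network average:", sum(xs) / len(xs))
    print("global optimum: ", sum(targets) / len(targets))
```

In this toy setting the network average approaches the global optimum (the mean of the local optima), but the individual nodes need not agree exactly, since nothing here corrects for data heterogeneity; that gap is the problem the abstract says the dual-slow estimation strategy targets.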

Original language: English
Title of host publication: ECAI 2023 - 26th European Conference on Artificial Intelligence, including 12th Conference on Prestigious Applications of Intelligent Systems, PAIS 2023 - Proceedings
Editors: Kobi Gal, Ann Nowe, Grzegorz J. Nalepa, Roy Fairstein, Roxana Radulescu
Publisher: IOS Press BV
Pages: 1625-1632
Number of pages: 8
ISBN (Electronic): 9781643684369
State: Published - 28 Sep 2023
Event: 26th European Conference on Artificial Intelligence, ECAI 2023 - Krakow, Poland
Duration: 30 Sep 2023 to 4 Oct 2023

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 372
ISSN (Print): 0922-6389
ISSN (Electronic): 1879-8314

Conference

Conference: 26th European Conference on Artificial Intelligence, ECAI 2023
Country/Territory: Poland
City: Krakow
Period: 30/09/23 to 4/10/23
