Panda: partially approximate newton methods for distributed minimax optimization with unbalanced dimensions

Minheng Xiao, Chengchang Liu*, Cheng Chen, John C.S. Lui, Sen Na

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

Unbalanced dimensions are crucial characteristics in various minimax optimization problems, such as few-shot learning (Cortes and Mohri in Adv Neural Inf Process Syst 16, 2003; Ying et al. in Adv Neural Inf Process Syst 29, 2016) and fairness-aware machine learning (Lowd and Meek, in: Proceedings of the eleventh ACM SIGKDD international conference on knowledge discovery in data mining, 2005; Zhang et al., in: Proceedings of the 2018 AAAI/ACM conference on AI, ethics, and society, 2018). In this paper, we propose a communication-efficient second-order method named PANDA (Partially Approximate Newton methods for Distributed minimAx) to solve problems with unbalanced dimensions. PANDA requires almost the same per-iteration communication cost as the first-order methods by utilizing the special problem structure in its design for data exchange between the client and server. More importantly, it exhibits a superior linear-quadratic convergence rate and significantly reduces the total number of communication rounds through the efficient use of second-order information. We also develop GIANT-PANDA based on the framework of PANDA, which further reduces the computation cost of the latter one by performing sketching operations on each client. Through comprehensive theoretical analysis and empirical evaluations, we demonstrate the superior performance of the proposed methods compared to existing state-of-the-art methods.

Original languageEnglish
Article number174
JournalMachine Learning
Volume114
Issue number8
DOIs
StatePublished - Aug 2025

Keywords

  • Distributed optimization
  • Fairness-aware machine learning
  • Few-shot learning
  • Minimax optimization

Fingerprint

Dive into the research topics of 'Panda: partially approximate newton methods for distributed minimax optimization with unbalanced dimensions'. Together they form a unique fingerprint.

Cite this