FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering

  • Md Sirajul Islam
  • Simin Javaherian
  • Fei Xu
  • Xu Yuan
  • Li Chen
  • Nian-Feng Tzeng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

Federated learning (FL) is an emerging distributed machine learning paradigm that enables collaborative training of machine learning models over decentralized devices without exposing their local data. One of the major challenges in FL is the presence of uneven data distributions across client devices, violating the well-known assumption of independent-and-identically-distributed (IID) training samples in conventional machine learning. To address the performance degradation incurred by such data heterogeneity, clustered federated learning (CFL) shows promise by grouping clients into separate learning clusters based on the similarity of their local data distributions. However, state-of-the-art CFL approaches require a large number of communication rounds to learn the distribution similarities during training, until the formation of clusters stabilizes. Moreover, some of these algorithms rely heavily on a predefined number of clusters, limiting their flexibility and adaptability. In this paper, we propose FedClust, a novel approach for CFL that leverages the correlation between local model weights and the data distribution of clients. FedClust groups clients into clusters in a one-shot manner by measuring the similarity degrees among clients based on strategically selected partial weights of locally trained models. We conduct extensive experiments on four benchmark datasets with different non-IID data settings. Experimental results demonstrate that FedClust achieves up to ~45% higher model accuracy and faster convergence, with communication costs reduced by up to 2.7×, compared to its state-of-the-art counterparts.
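The one-shot, weight-driven grouping described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: it assumes the server receives a selected slice of each client's locally trained weights (e.g., final-layer parameters, flattened to a vector), compares clients by cosine distance, and greedily forms clusters without a predefined cluster count. The function names and the distance threshold are hypothetical.

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity between two flat weight vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

def cluster_clients(partial_weights, threshold=0.5):
    """Greedy one-shot clustering: assign each client to the first cluster
    whose representative weights are within `threshold` cosine distance,
    otherwise open a new cluster. No predefined number of clusters."""
    clusters = []  # list of lists of client indices
    reps = []      # one representative weight vector per cluster
    for i, w in enumerate(partial_weights):
        for c, rep in enumerate(reps):
            if cosine_distance(w, rep) <= threshold:
                clusters[c].append(i)
                break
        else:
            clusters.append([i])
            reps.append(w)
    return clusters

# Toy example: two clients with similar partial weights, one dissimilar.
weights = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.0], [-1.0, 0.2, 0.8]]
print(cluster_clients(weights))  # → [[0, 1], [2]]
```

Because clustering happens once, from weights the server already collects, this style of grouping avoids the many communication rounds that similarity-learning CFL methods spend while cluster membership stabilizes.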

Original language: English
Title of host publication: 53rd International Conference on Parallel Processing, ICPP 2024 - Main Conference Proceedings
Publisher: Association for Computing Machinery
Pages: 474-483
Number of pages: 10
ISBN (Electronic): 9798400708428
State: Published - 12 Aug 2024
Event: 53rd International Conference on Parallel Processing, ICPP 2024 - Gotland, Sweden
Duration: 12 Aug 2024 - 15 Aug 2024

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 53rd International Conference on Parallel Processing, ICPP 2024
Country/Territory: Sweden
City: Gotland
Period: 12/08/24 - 15/08/24

Keywords

  • Clustered Federated Learning
  • Federated Learning
  • Non-IID Data
