Utility-efficient differentially private K-means clustering based on cluster merging

  • Tianjiao Ni
  • , Minghao Qiao
  • , Zhili Chen*
  • , Shun Zhang
  • , Hong Zhong
  • *Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

38 Scopus citations

Abstract

Differential privacy is widely used in data analysis. State-of-the-art k-means clustering algorithms with differential privacy typically add an equal amount of noise to centroids for each iterative computation. In this paper, we propose a novel differentially private k-means clustering algorithm, DP-KCCM, that significantly improves the utility of clustering by adding adaptive noise and merging clusters. Specifically, to obtain k clusters with differential privacy, the algorithm first generates n×k initial centroids, adds adaptive noise for each iteration to get n×k clusters, and finally merges these clusters into k ones. We theoretically prove the differential privacy of the proposed algorithm. Surprisingly, extensive experimental results show that: 1) cluster merging with equal amounts of noise improves the utility somewhat; 2) while adding adaptive noise only does not improve the utility, combining both cluster merging and adaptive noise further improves the utility significantly.

Original languageEnglish
Pages (from-to)205-214
Number of pages10
JournalNeurocomputing
Volume424
DOIs
StatePublished - 1 Feb 2021
Externally publishedYes

Keywords

  • Cluster
  • Differential privacy
  • K-means

Fingerprint

Dive into the research topics of 'Utility-efficient differentially private K-means clustering based on cluster merging'. Together they form a unique fingerprint.

Cite this