TY - GEN
T1 - Coresets for Deletion-Robust k-Center Clustering
AU - Li, Ruien
AU - Wang, Yanhao
AU - Mathioudakis, Michael
N1 - Publisher Copyright:
© 2024 ACM.
PY - 2024/10/21
Y1 - 2024/10/21
N2 - The k-center clustering problem is of fundamental importance for a broad range of machine learning and data science applications. In this paper, we study the deletion-robust version of the problem. Specifically, we aim to extract a small subset of a given data set, referred to as a coreset, that contains a provably good set of k centers even after an adversary deletes up to z arbitrarily chosen points from the data set. We propose a 4-approximation algorithm that provides a coreset of size O(kz). To our knowledge, this is the first algorithm for deletion-robust k-center clustering with a theoretical guarantee. Moreover, we accompany our theoretical results with extensive experiments, demonstrating that our algorithm achieves significantly better robustness than non-trivial baselines against three heuristic gray-box and white-box adversarial deletion attacks.
AB - The k-center clustering problem is of fundamental importance for a broad range of machine learning and data science applications. In this paper, we study the deletion-robust version of the problem. Specifically, we aim to extract a small subset of a given data set, referred to as a coreset, that contains a provably good set of k centers even after an adversary deletes up to z arbitrarily chosen points from the data set. We propose a 4-approximation algorithm that provides a coreset of size O(kz). To our knowledge, this is the first algorithm for deletion-robust k-center clustering with a theoretical guarantee. Moreover, we accompany our theoretical results with extensive experiments, demonstrating that our algorithm achieves significantly better robustness than non-trivial baselines against three heuristic gray-box and white-box adversarial deletion attacks.
KW - coreset
KW - deletion robustness
KW - k-center clustering
UR - https://www.scopus.com/pages/publications/85210032144
U2 - 10.1145/3627673.3679890
DO - 10.1145/3627673.3679890
M3 - 会议稿件
AN - SCOPUS:85210032144
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 3877
EP - 3881
BT - CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024
Y2 - 21 October 2024 through 25 October 2024
ER -