CAST: A Correlation-based Adaptive Spectral Clustering Algorithm on Multi-scale Data

  • Xiang Li
  • Ben Kao
  • Caihua Shan
  • Dawei Yin
  • Martin Ester
  • The University of Hong Kong
  • JD.com, Inc.
  • Simon Fraser University

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Peer-reviewed

Abstract

We study the problem of applying spectral clustering to cluster multi-scale data, which is data whose clusters are of various sizes and densities. Traditional spectral clustering techniques discover clusters by processing a similarity matrix that reflects the proximity of objects. For multi-scale data, distance-based similarity is not effective because objects of a sparse cluster could be far apart while those of a dense cluster have to be sufficiently close. Following [16], we solve the problem of spectral clustering on multi-scale data by integrating the concept of objects' "reachability similarity" with a given distance-based similarity to derive an objects' coefficient matrix. We propose the algorithm CAST that applies trace Lasso to regularize the coefficient matrix. We prove that the resulting coefficient matrix has the "grouping effect" and that it exhibits "sparsity". We show that these two characteristics imply very effective spectral clustering. We evaluate CAST and 10 other clustering methods on a wide range of datasets w.r.t. various measures. Experimental results show that CAST provides excellent performance and is highly robust across test cases of multi-scale data.
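
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of traditional spectral clustering on a distance-based similarity matrix, restricted to two clusters. It is not the CAST algorithm: it uses neither reachability similarity nor trace Lasso. The function name `spectral_bipartition`, the toy points, and the Gaussian kernel with bandwidth 1.0 are all assumptions chosen for illustration.

```python
import numpy as np

def spectral_bipartition(W):
    """Split objects into two clusters given a symmetric similarity matrix W.

    Hypothetical helper for illustration; this is plain normalized spectral
    clustering, not the CAST algorithm.
    """
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    laplacian = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues (and matching eigenvectors) in ascending order
    _, eigvecs = np.linalg.eigh(laplacian)
    # Second-smallest eigenvector, rescaled by D^{-1/2} so that it is roughly
    # constant within each well-separated group
    fiedler = d_inv_sqrt * eigvecs[:, 1]
    # The sign of each entry assigns the object to one of two clusters
    return (fiedler > 0).astype(int)

# Toy data (assumed for illustration): a dense and a sparse group on a line.
points = np.array([0.0, 0.1, 0.2, 0.3, 5.0, 6.0, 7.0, 8.0])
# Gaussian similarity with bandwidth 1.0 (an assumption, not from the paper).
W = np.exp(-0.5 * (points[:, None] - points[None, :]) ** 2)
labels = spectral_bipartition(W)
print(labels)
```

The toy groups here are far apart, so a single kernel bandwidth is enough to separate them. The abstract's concern is the harder case in which clusters differ so much in size and density that no single distance-based similarity suits all of them.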

Original language: English
Title of host publication: KDD 2020 - Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 439-449
Number of pages: 11
ISBN (electronic): 9781450379984
DOI
Publication status: Published - 23 Aug 2020
Externally published
Event: 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2020 - Virtual, Online, United States
Duration: 23 Aug 2020 to 27 Aug 2020

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference: 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2020
Country/Territory: United States
Virtual, Online
Period: 23/08/20 to 27/08/20
