
K-nearest-neighbor consistency in data clustering: Incorporating local information into global optimization

  • Lawrence Berkeley National Laboratory

Research output: Conference contribution, paper, peer-reviewed

Abstract

Nearest neighbor consistency is a central concept in statistical pattern recognition, especially in kNN classification methods, which rest on a strong theoretical foundation. In this paper, we extend this concept to data clustering, requiring that for any data point in a cluster, its k-nearest neighbors and mutual nearest neighbors should also be in the same cluster. We study properties of cluster k-nearest neighbor consistency and propose algorithms that enforce and improve kNN and kMN consistency. Extensive experiments on Internet newsgroup datasets using the K-means clustering algorithm with kNN consistency enhancement show that kNN/kMN consistency can be improved significantly (about 100% for 1MN and 1NN consistencies) while the clustering accuracy is improved simultaneously. This indicates that local consistency information helps optimize the global cluster objective function.
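To make the consistency requirement concrete, here is a minimal Python sketch of how one might measure it for a given clustering. It is not the paper's algorithm: the function names (`k_nearest`, `knn_consistency`, `kmn_consistency`), the Euclidean distance, the "fraction of consistent points" score, and the choice to count a point with no mutual neighbors as consistent are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    # Break distance ties by index so the result is deterministic.
    others.sort(key=lambda j: (dist(points[i], points[j]), j))
    return others[:k]


def knn_consistency(points, labels, k=1):
    """Fraction of points whose k nearest neighbors all share their cluster label."""
    consistent = 0
    for i in range(len(points)):
        if all(labels[j] == labels[i] for j in k_nearest(points, i, k)):
            consistent += 1
    return consistent / len(points)


def kmn_consistency(points, labels, k=1):
    """Same check, restricted to mutual neighbors.

    j is a mutual neighbor of i when each is among the other's k nearest.
    Assumption: a point with no mutual neighbors counts as consistent.
    """
    neighbors = [set(k_nearest(points, i, k)) for i in range(len(points))]
    consistent = 0
    for i in range(len(points)):
        mutual = [j for j in neighbors[i] if i in neighbors[j]]
        if all(labels[j] == labels[i] for j in mutual):
            consistent += 1
    return consistent / len(points)


# Hypothetical toy data: five points on a line, two clusters.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0), (0.6, 0.0)]
labels = [0, 0, 1, 1, 0]

print(knn_consistency(points, labels, k=1))  # 0.8
print(kmn_consistency(points, labels, k=1))  # 1.0
```

In this toy data the last point is assigned to cluster 0 although its nearest neighbor lies in cluster 1, so it violates 1NN consistency. That neighbor does not point back at it, so the mutual-neighbor check is not violated, which shows that under these assumptions kMN consistency is the weaker of the two requirements.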

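Original language: English
Pages: 584-589
Number of pages: 6
Publication status: Published - 2004
Externally published: Yes
Event: Applied Computing 2004 - Proceedings of the 2004 ACM Symposium on Applied Computing - Nicosia, Cyprus
Duration: 14 Mar 2004 to 17 Mar 2004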

Conference

Conference: Applied Computing 2004 - Proceedings of the 2004 ACM Symposium on Applied Computing
Country/Territory: Cyprus
City: Nicosia
Period: 14/03/04 to 17/03/04
