Adaptive dimension reduction for clustering high dimensional data

  • Chris Ding*
  • Xiaofeng He
  • Hongyuan Zha
  • Horst D. Simon
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

173 Scopus citations

Abstract

It is well known that for high dimensional data clustering, standard algorithms such as EM and K-means are often trapped in local minima. Many initialization methods were proposed to tackle this problem, but with only limited success. In this paper we propose a new approach to resolve this problem by repeated dimension reductions such that K-means or EM is performed only in very low dimensions. Cluster membership is utilized as a bridge between the reduced dimensional subspace and the original space, providing flexibility and ease of implementation. Clustering analysis performed on highly overlapped Gaussians, DNA gene expression profiles and internet newsgroups demonstrates the effectiveness of the proposed algorithm.
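The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every concrete choice in it is an assumption that the abstract does not specify: the initial subspace comes from PCA, each later subspace is spanned by the current cluster centroids computed in the original space, K-means starts from a farthest-point initialization, and all function names are hypothetical.

```python
import numpy as np

def cluster_means(X, labels, k, fallback):
    """Per-cluster means; an empty cluster keeps its row from `fallback`."""
    out = fallback.copy()
    for j in range(k):
        if np.any(labels == j):
            out[j] = X[labels == j].mean(axis=0)
    return out

def farthest_point_init(Y, k):
    """Assumed initialization: start far from the origin, then spread out."""
    idx = [int(np.argmax((Y ** 2).sum(axis=1)))]
    for _ in range(k - 1):
        # distance from every point to its nearest already-chosen center
        d = ((Y[:, None, :] - Y[idx][None, :, :]) ** 2).sum(axis=2).min(axis=1)
        idx.append(int(np.argmax(d)))
    return Y[idx].copy()

def kmeans(Y, centers, n_iter=50):
    """Plain Lloyd iterations from the given centers; returns labels."""
    for _ in range(n_iter):
        d = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = cluster_means(Y, labels, len(centers), centers)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def adaptive_reduction_kmeans(X, k, n_rounds=10):
    """Alternate between low-dimensional K-means and subspace updates (k >= 2)."""
    Xc = X - X.mean(axis=0)
    # Assumed starting subspace: top k-1 principal directions.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    basis = vt[:k - 1].T
    labels = None
    for _ in range(n_rounds):
        Y = Xc @ basis                      # data in the reduced subspace
        init = farthest_point_init(Y, k)
        if labels is not None:
            # Membership is the bridge: reuse it to seed the new subspace.
            init = cluster_means(Y, labels, k, init)
        new_labels = kmeans(Y, init)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Centroids in the original space define the next subspace.
        M = cluster_means(Xc, labels, k, np.zeros((k, Xc.shape[1])))
        _, _, vt = np.linalg.svd(M, full_matrices=False)
        basis = vt[:k - 1].T
    return labels
```

In this sketch K-means only ever runs in k-1 dimensions, while the cluster labels carry information back to the original space, where the centroids are recomputed.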

Original language: English
Title of host publication: Proceedings - 2002 IEEE International Conference on Data Mining, ICDM 2002
Pages: 147-154
Number of pages: 8
State: Published - 2002
Externally published: Yes
Event: 2nd IEEE International Conference on Data Mining, ICDM '02 - Maebashi, Japan
Duration: 9 Dec 2002 → 12 Dec 2002

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Conference

Conference: 2nd IEEE International Conference on Data Mining, ICDM '02
Country/Territory: Japan
City: Maebashi
Period: 9/12/02 → 12/12/02
