KLNCC: A new nonlinear correlation clustering algorithm based on KL-divergence

  • Chaofeng Sha*
  • Xipeng Qiu
  • Aoying Zhou
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citation

Abstract

The problem of finding correlations among subsets of features in high-dimensional data arises in many applications. Much prior work addresses this problem, covering both linear and nonlinear correlation clusters. In this paper, we present KLNCC, a novel nonlinear correlation clustering algorithm that adopts a dynamic two-phase approach. In the first phase, we find microclusters using the EM algorithm. In the second phase, these microclusters are merged in a bottom-up manner, resulting in a dendrogram. The final clustering is determined by the user. When merging microclusters, we adopt the KL-divergence as the distance between two microclusters; it has an explicit form when the EM clustering algorithm is used to find the microclusters. Our experimental evaluation on several real datasets demonstrates that KLNCC discovers meaningful and accurate nonlinear correlation clusters.
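The second phase described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it assumes several things the abstract does not state: each microclusters is a weighted Gaussian `(weight, mean, covariance)` as an EM-fitted Gaussian mixture would produce, the distance is symmetrized by summing the KL-divergence in both directions, and a merged pair is summarized by moment matching. The names `gaussian_kl`, `symmetric_kl`, `merge`, and `agglomerate` are hypothetical. The first phase (EM fitting) is omitted.

```python
import numpy as np


def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL(N(mu0, cov0) || N(mu1, cov1)) for multivariate Gaussians."""
    d = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (
        np.trace(cov1_inv @ cov0) + diff @ cov1_inv @ diff - d + logdet1 - logdet0
    )


def symmetric_kl(a, b):
    """Assumption: symmetrize by summing the KL-divergence in both directions."""
    return gaussian_kl(a[1], a[2], b[1], b[2]) + gaussian_kl(b[1], b[2], a[1], a[2])


def merge(a, b):
    """Assumption: summarize a merged pair by a moment-matched Gaussian."""
    w = a[0] + b[0]
    mu = (a[0] * a[1] + b[0] * b[1]) / w
    cov = sum(wi * (ci + np.outer(mi - mu, mi - mu)) for wi, mi, ci in (a, b)) / w
    return (w, mu, cov)


def agglomerate(microclusters):
    """Bottom-up merging; returns the merge history as (id_i, id_j, distance).

    Original microclusters get ids 0..n-1; each merged cluster gets the next id,
    so the history encodes a dendrogram.
    """
    active = dict(enumerate(microclusters))
    next_id = len(microclusters)
    history = []
    while len(active) > 1:
        ids = sorted(active)
        # Exhaustive search for the closest pair among the active clusters.
        dist, i, j = min(
            (symmetric_kl(active[i], active[j]), i, j)
            for k, i in enumerate(ids)
            for j in ids[k + 1:]
        )
        active[next_id] = merge(active.pop(i), active.pop(j))
        history.append((i, j, float(dist)))
        next_id += 1
    return history
```

A user would then pick the final clustering by cutting the returned merge history at a chosen distance or cluster count. The exhaustive pair search is quadratic per step, which is acceptable only for a modest number of microclusters.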

Original language: English
Title of host publication: Proceedings - 2008 IEEE 8th International Conference on Computer and Information Technology, CIT 2008
Pages: 125-130
Number of pages: 6
State: Published - 2008
Externally published: Yes
Event: 2008 IEEE 8th International Conference on Computer and Information Technology, CIT 2008 - Sydney, NSW, Australia
Duration: 8 Jul 2008 to 11 Jul 2008

Publication series

Name: Proceedings - 2008 IEEE 8th International Conference on Computer and Information Technology, CIT 2008

Conference

Conference: 2008 IEEE 8th International Conference on Computer and Information Technology, CIT 2008
Country/Territory: Australia
City: Sydney, NSW
Period: 8/07/08 to 11/07/08
