TY - GEN
T1 - KLNCC
T2 - 2008 IEEE 8th International Conference on Computer and Information Technology, CIT 2008
AU - Sha, Chaofeng
AU - Qiu, Xipeng
AU - Zhou, Aoying
PY - 2008
Y1 - 2008
N2 - The problem of finding correlation among subsets of features in high-dimensional data arises in many applications. There has been much work on finding those correlations, including linear and nonlinear correlation clusters. In this paper, we present KLNCC, a novel nonlinear correlation clustering algorithm that adopts a dynamic two-phase approach. In the first phase, we find microclusters using the EM algorithm. In the second phase, these microclusters are merged in a bottom-up manner, resulting in a dendrogram. The final clustering is determined by the user. When merging microclusters, we adopt the KL-divergence as the distance between two microclusters, which has an explicit form when the EM clustering algorithm is used to find the microclusters. Our experimental evaluation on several real datasets demonstrates that KLNCC discovers meaningful and accurate nonlinear correlation clusters.
AB - The problem of finding correlation among subsets of features in high-dimensional data arises in many applications. There has been much work on finding those correlations, including linear and nonlinear correlation clusters. In this paper, we present KLNCC, a novel nonlinear correlation clustering algorithm that adopts a dynamic two-phase approach. In the first phase, we find microclusters using the EM algorithm. In the second phase, these microclusters are merged in a bottom-up manner, resulting in a dendrogram. The final clustering is determined by the user. When merging microclusters, we adopt the KL-divergence as the distance between two microclusters, which has an explicit form when the EM clustering algorithm is used to find the microclusters. Our experimental evaluation on several real datasets demonstrates that KLNCC discovers meaningful and accurate nonlinear correlation clusters.
UR - https://www.scopus.com/pages/publications/51849118858
U2 - 10.1109/CIT.2008.4594661
DO - 10.1109/CIT.2008.4594661
M3 - Conference contribution
AN - SCOPUS:51849118858
SN - 9781424423583
T3 - Proceedings - 2008 IEEE 8th International Conference on Computer and Information Technology, CIT 2008
SP - 125
EP - 130
BT - Proceedings - 2008 IEEE 8th International Conference on Computer and Information Technology, CIT 2008
Y2 - 8 July 2008 through 11 July 2008
ER -