TY - GEN
T1 - Tracking high quality clusters over uncertain data streams
AU - Zhang, Chen
AU - Gao, Ming
AU - Zhou, Aoying
PY - 2009
Y1 - 2009
N2 - Data mining over uncertain data streams has recently attracted considerable attention because imprecise data is generated by a wide variety of streaming applications. In this paper, we address the problem of clustering over uncertain data streams. Faced with uncertain tuples that follow different probability distributions, a clustering algorithm should consider not only each tuple's value but also its uncertainty. To serve both purposes, we integrate a metric named tuple uncertainty into the overall clustering procedure. First, we survey the uncertain data model and propose our uncertainty measure together with its properties. Second, based on this uncertainty quantification method, we present a two-phase stream clustering algorithm and elaborate on its implementation details. Finally, performance experiments on a number of real and synthetic data sets demonstrate the effectiveness and efficiency of our method.
AB - Data mining over uncertain data streams has recently attracted considerable attention because imprecise data is generated by a wide variety of streaming applications. In this paper, we address the problem of clustering over uncertain data streams. Faced with uncertain tuples that follow different probability distributions, a clustering algorithm should consider not only each tuple's value but also its uncertainty. To serve both purposes, we integrate a metric named tuple uncertainty into the overall clustering procedure. First, we survey the uncertain data model and propose our uncertainty measure together with its properties. Second, based on this uncertainty quantification method, we present a two-phase stream clustering algorithm and elaborate on its implementation details. Finally, performance experiments on a number of real and synthetic data sets demonstrate the effectiveness and efficiency of our method.
UR - https://www.scopus.com/pages/publications/67649641425
U2 - 10.1109/ICDE.2009.160
DO - 10.1109/ICDE.2009.160
M3 - Conference contribution
AN - SCOPUS:67649641425
SN - 9780769535456
T3 - Proceedings - International Conference on Data Engineering
SP - 1641
EP - 1648
BT - Proceedings - 25th IEEE International Conference on Data Engineering, ICDE 2009
T2 - 25th IEEE International Conference on Data Engineering, ICDE 2009
Y2 - 29 March 2009 through 2 April 2009
ER -