Tracking high quality clusters over uncertain data streams

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

36 Scopus citations

Abstract

Data mining over uncertain data streams has recently attracted considerable attention because imprecise data is generated by a wide variety of streaming applications. In this paper, we address the problem of clustering over uncertain data streams. When tuples are uncertain and follow different probability distributions, a clustering algorithm should consider not only each tuple's value but also its uncertainty. To serve both purposes, we integrate a metric named tuple uncertainty into the overall clustering procedure. First, we survey the uncertain data model and propose our uncertainty measurement together with its properties. Second, based on this uncertainty quantification method, we present a two-phase stream clustering algorithm and elaborate on its implementation details. Finally, performance experiments over a number of real and synthetic data sets demonstrate the effectiveness and efficiency of our method.

Original language: English
Title of host publication: Proceedings - 25th IEEE International Conference on Data Engineering, ICDE 2009
Pages: 1641-1648
Number of pages: 8
State: Published - 2009
Event: 25th IEEE International Conference on Data Engineering, ICDE 2009 - Shanghai, China
Duration: 29 Mar 2009 to 2 Apr 2009

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Print): 1084-4627

Conference

Conference: 25th IEEE International Conference on Data Engineering, ICDE 2009
Country/Territory: China
City: Shanghai
Period: 29/03/09 to 2/04/09
