Approximately processing multi-granularity aggregate queries over data streams

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

20 Scopus citations

Abstract

Aggregate monitoring over data streams is attracting more and more attention in research community due to its broad potential applications. Existing methods suffer two problems, 1) The aggregate functions which could be monitored are restricted to be first-order statistic or monotonic with respect to the window size. 2) Only a limited number of granularity and time scales could be monitored over a stream, thus some interesting patterns might be neglected, and users might be misled by the incomplete changing profile about current data streams. These two impede the development of online mining techniques over data streams, and some kind of breakthrough is urged. In this paper, we employed the powerful tool of fractal analysis to enable the monitoring of both monotonic and non-monotonic aggregates on time-changing data streams. The monotony property of aggregate monitoring is revealed and monotonic search space is built to decrease the time overhead for accessing the synopsis from O(m) to O(log m), where m is the number of windows to be monitored. With the help of a novel inverted histogram, the statistical summary is compressed to befit in limited main memory, so that high aggregates on windows of any length can be detected accurately and efficiently on-line. Theoretical analysis show the space and time complexity bound of this method are relatively low, while experimental results prove the applicability and efficiency of the proposed algorithm in different application settings.

Original languageEnglish
Title of host publicationProceedings of the 22nd International Conference on Data Engineering, ICDE '06
Pages67
Number of pages1
DOIs
StatePublished - 2006
Externally publishedYes
Event22nd International Conference on Data Engineering, ICDE '06 - Atlanta, GA, United States
Duration: 3 Apr 20067 Apr 2006

Publication series

NameProceedings - International Conference on Data Engineering
Volume2006
ISSN (Print)1084-4627

Conference

Conference22nd International Conference on Data Engineering, ICDE '06
Country/TerritoryUnited States
CityAtlanta, GA
Period3/04/067/04/06

Fingerprint

Dive into the research topics of 'Approximately processing multi-granularity aggregate queries over data streams'. Together they form a unique fingerprint.

Cite this