TY - GEN
T1 - Approximately processing multi-granularity aggregate queries over data streams
AU - Qin, Shouke
AU - Qian, Weining
AU - Zhou, Aoying
PY - 2006
Y1 - 2006
N2 - Aggregate monitoring over data streams is attracting more and more attention in research community due to its broad potential applications. Existing methods suffer two problems, 1) The aggregate functions which could be monitored are restricted to be first-order statistic or monotonic with respect to the window size. 2) Only a limited number of granularity and time scales could be monitored over a stream, thus some interesting patterns might be neglected, and users might be misled by the incomplete changing profile about current data streams. These two impede the development of online mining techniques over data streams, and some kind of breakthrough is urged. In this paper, we employed the powerful tool of fractal analysis to enable the monitoring of both monotonic and non-monotonic aggregates on time-changing data streams. The monotony property of aggregate monitoring is revealed and monotonic search space is built to decrease the time overhead for accessing the synopsis from O(m) to O(log m), where m is the number of windows to be monitored. With the help of a novel inverted histogram, the statistical summary is compressed to befit in limited main memory, so that high aggregates on windows of any length can be detected accurately and efficiently on-line. Theoretical analysis show the space and time complexity bound of this method are relatively low, while experimental results prove the applicability and efficiency of the proposed algorithm in different application settings.
AB - Aggregate monitoring over data streams is attracting more and more attention in research community due to its broad potential applications. Existing methods suffer two problems, 1) The aggregate functions which could be monitored are restricted to be first-order statistic or monotonic with respect to the window size. 2) Only a limited number of granularity and time scales could be monitored over a stream, thus some interesting patterns might be neglected, and users might be misled by the incomplete changing profile about current data streams. These two impede the development of online mining techniques over data streams, and some kind of breakthrough is urged. In this paper, we employed the powerful tool of fractal analysis to enable the monitoring of both monotonic and non-monotonic aggregates on time-changing data streams. The monotony property of aggregate monitoring is revealed and monotonic search space is built to decrease the time overhead for accessing the synopsis from O(m) to O(log m), where m is the number of windows to be monitored. With the help of a novel inverted histogram, the statistical summary is compressed to befit in limited main memory, so that high aggregates on windows of any length can be detected accurately and efficiently on-line. Theoretical analysis show the space and time complexity bound of this method are relatively low, while experimental results prove the applicability and efficiency of the proposed algorithm in different application settings.
UR - https://www.scopus.com/pages/publications/33749625585
U2 - 10.1109/ICDE.2006.22
DO - 10.1109/ICDE.2006.22
M3 - 会议稿件
AN - SCOPUS:33749625585
SN - 0769525709
SN - 9780769525709
T3 - Proceedings - International Conference on Data Engineering
SP - 67
BT - Proceedings of the 22nd International Conference on Data Engineering, ICDE '06
T2 - 22nd International Conference on Data Engineering, ICDE '06
Y2 - 3 April 2006 through 7 April 2006
ER -