Filtering duplicate items over distributed data streams

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

In recent years, many real-time applications have needed to handle data streams. We consider distributed environments in which remote data sources continuously collect data from the real world or from other data sources and push it to a central stream processor. In such environments, transmitting rapid, high-volume, time-varying data streams induces significant communication cost and also imposes computing overhead on the central processor. In this paper, we develop a novel filtering approach, called DTFilter, for evaluating windowed distinct queries in such a distributed system. DTFilter is based on a search algorithm over a data structure of two height-balanced trees; it avoids transmitting duplicate items in the data streams and thereby saves network resources. In addition, we provide a theoretical analysis of the search time and of the memory required. Extensive experiments also show that DTFilter achieves high performance.
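
To make the idea of source-side duplicate filtering for a windowed distinct query concrete, here is a minimal sketch. It is not the DTFilter algorithm: the abstract does not describe how the two height-balanced trees are organised, so this sketch substitutes a plain dictionary and a deque. The class name `WindowedDuplicateFilter`, the method `offer`, the time-based window, and the sample stream are all assumptions made for illustration.

```python
from collections import deque


class WindowedDuplicateFilter:
    """Hypothetical source-side filter: suppress a value that was already
    forwarded within the last `window` time units."""

    def __init__(self, window):
        self.window = window
        self.last_sent = {}   # value -> timestamp at which it was forwarded
        self.order = deque()  # (timestamp, value) pairs in forwarding order

    def _expire(self, now):
        # Forget values whose forwarding time has slid out of the window.
        while self.order and self.order[0][0] <= now - self.window:
            ts, value = self.order.popleft()
            if self.last_sent.get(value) == ts:
                del self.last_sent[value]

    def offer(self, value, now):
        """Return True if `value` should be transmitted, False if it is a
        duplicate of something already sent inside the current window."""
        self._expire(now)
        if value in self.last_sent:
            return False
        self.last_sent[value] = now
        self.order.append((now, value))
        return True


# Illustrative stream of (timestamp, value) pairs; the data is made up.
stream = [(1, "a"), (2, "b"), (3, "a"), (7, "a"), (8, "b")]
dup_filter = WindowedDuplicateFilter(window=5)
sent = [(t, v) for t, v in stream if dup_filter.offer(v, t)]
print(sent)
```

With a window of 5 time units, the second "a" at time 3 is suppressed at the source and never crosses the network, while the "a" at time 7 is sent again because the earlier copy has expired. A tree-based index, as the abstract describes, would presumably replace the dictionary and deque used here.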

Original language: English
Title of host publication: Advances in Web-Age Information Management - 6th International Conference, WAIM 2005, Proceedings
Publisher: Springer Verlag
Pages: 779-784
Number of pages: 6
ISBN (Print): 3540292276, 9783540292272
State: Published - 2005
Externally published: Yes
Event: 6th International Conference on Advances in Web-Age Information Management, WAIM 2005 - Hangzhou, China
Duration: 11 Oct 2005 to 13 Oct 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3739 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th International Conference on Advances in Web-Age Information Management, WAIM 2005
Country/Territory: China
City: Hangzhou
Period: 11/10/05 to 13/10/05
