Cost-effective stream join algorithm on cloud system

  • Junhua Fang
  • Rong Zhang*
  • Xiaotong Wang
  • Tom Z.J. Fu
  • Zhenjie Zhang
  • Aoying Zhou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Scopus citations

Abstract

The matrix-based model is well suited to the distributed stream join operator: it applies to arbitrary join predicates and guarantees the completeness of the join results. However, the high dynamicity and uncertainty of real-world data streams call for better adaptivity and lower operational cost, without which the stream join operator may suffer from performance drops and overpaid computation resources. The existing Join-Matrix model is unable to provide such capability, due to its fixed workload partitioning and the difficulty of dynamic repartitioning. It is thus unclear how to take advantage of the load balancing benefits of the Join-Matrix model while providing more flexibility to the distributed stream join computation at a lower cost. In this paper, we present a new cost-effective stream join algorithm that enhances the adaptability of the Join-Matrix model and minimizes resource usage as the workload varies. Our proposal includes a varietal matrix generation algorithm devised to build an irregular matrix scheme for minimal task assignment; a lightweight migration algorithm designed to cut off unnecessary migration cost; and a load balancing framework to maximize the processing throughput. Extensive experiments compare our proposal against state-of-the-art solutions on benchmark and real-world workloads, demonstrating the effectiveness of our method, especially in reducing the operational cost under a pay-as-you-go pricing scheme.
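For readers unfamiliar with the join-matrix idea the abstract builds on, the following is a minimal single-process sketch of the basic regular matrix scheme, not the irregular scheme or the migration algorithm proposed in the paper. Each cell of a rows x cols grid stands in for one worker task; a tuple from one stream is replicated along a randomly chosen row, and a tuple from the other stream along a randomly chosen column, so every pair of tuples meets in exactly one cell regardless of the join predicate. The class name `JoinMatrix`, its methods, and the sample values are hypothetical and chosen only for illustration.

```python
import random
from itertools import product


class JoinMatrix:
    """Toy rows x cols join matrix (illustrative only; names are assumptions)."""

    def __init__(self, rows, cols, predicate, seed=0):
        self.rows, self.cols = rows, cols
        self.predicate = predicate          # arbitrary theta-join condition
        self.rng = random.Random(seed)
        # Each cell keeps the R tuples and S tuples it has received so far.
        self.cells = {(i, j): ([], [])
                      for i, j in product(range(rows), range(cols))}
        self.results = []

    def insert_r(self, r):
        # An R tuple picks one row and is replicated to every cell in it.
        i = self.rng.randrange(self.rows)
        for j in range(self.cols):
            r_store, s_store = self.cells[(i, j)]
            # Probe the S tuples already stored in this cell, then store r.
            self.results.extend((r, s) for s in s_store if self.predicate(r, s))
            r_store.append(r)

    def insert_s(self, s):
        # An S tuple picks one column and is replicated to every cell in it.
        j = self.rng.randrange(self.cols)
        for i in range(self.rows):
            r_store, s_store = self.cells[(i, j)]
            self.results.extend((r, s) for r in r_store if self.predicate(r, s))
            s_store.append(s)


def run_demo():
    """Interleave two small streams and join them with the predicate r < s."""
    r_vals, s_vals = [5, 1, 9, 4], [3, 7, 2, 8]
    jm = JoinMatrix(rows=2, cols=3, predicate=lambda r, s: r < s)
    for r, s in zip(r_vals, s_vals):
        jm.insert_r(r)
        jm.insert_s(s)
    expected = sorted((r, s) for r in r_vals for s in s_vals if r < s)
    return sorted(jm.results), expected
```

Because an R tuple's row and an S tuple's column intersect in exactly one cell, each matching pair is emitted once, by whichever tuple arrives second. The fixed grid shape in this sketch is the kind of fixed workload partitioning the abstract identifies as a limitation.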

Original language: English
Title of host publication: CIKM 2016 - Proceedings of the 2016 ACM Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 1773-1782
Number of pages: 10
ISBN (Electronic): 9781450340731
DOIs
State: Published - 24 Oct 2016
Event: 25th ACM International Conference on Information and Knowledge Management, CIKM 2016 - Indianapolis, United States
Duration: 24 Oct 2016 to 28 Oct 2016

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings
Volume: 24-28-October-2016

Conference

Conference: 25th ACM International Conference on Information and Knowledge Management, CIKM 2016
Country/Territory: United States
City: Indianapolis
Period: 24/10/16 to 28/10/16

Keywords

  • Cost effective
  • Distributed stream join
  • Matrix model
  • Theta-join
