Approximate Calculation of Window Aggregate Functions via Global Random Sample

  • Guangxuan Song
  • Wenwen Qu
  • Xiaojie Liu
  • Xiaoling Wang*

  *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Window functions have been part of the SQL standard since 2003 and have been studied extensively over the past decade. They are widely used in data analysis, and almost all current mainstream commercial databases support them. However, dataset sizes have grown steeply in recent years, and existing window function implementations are not efficient enough. Sampling-based algorithms (e.g., online aggregation) have recently been proposed to deal with large and complex data in relational databases, offering a flexible trade-off between accuracy and efficiency. However, few sampling techniques have been considered for window functions in databases. In this paper, we extend our previous work (Song et al. in Asia-Pacific web and web-age information management joint conference on web and big data, Springer, pp 229–244, 2017) and propose two new algorithms: a range-based global sampling algorithm and a row-labeled sampling algorithm. The proposed algorithms use global sampling rather than local sampling and are more efficient than other existing algorithms. Our proposed algorithms outperform the baseline method on the TPC-H benchmark dataset.
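
To make the idea of global sampling concrete, the following is a minimal Python sketch that estimates a sliding-window AVG from one uniform sample drawn once over the whole input and shared by every window. It is an illustration under stated assumptions, not the range-based or row-labeled algorithm from the paper: the function names, the ROWS-style frame, the sampling rate parameter, and the prefix-sum lookup are all hypothetical choices made for this example.

```python
import random
from bisect import bisect_left, bisect_right


def exact_window_avg(values, preceding, following):
    """Exact AVG over a frame of `preceding` rows before and `following` rows after."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - preceding), min(n - 1, i + following)
        window = values[lo:hi + 1]
        out.append(sum(window) / len(window))
    return out


def sampled_window_avg(values, preceding, following, rate, seed=0):
    """Estimate the same AVG using a single global uniform sample of rows.

    Hypothetical illustration: the sample is drawn once for the whole input
    (global), rather than re-drawn inside each window (local).
    """
    rng = random.Random(seed)
    n = len(values)
    k = max(1, int(n * rate))
    # Sorted positions of the sampled rows, shared by all windows.
    positions = sorted(rng.sample(range(n), k))
    # Prefix sums over the sampled values allow O(log k) work per window.
    prefix = [0.0]
    for p in positions:
        prefix.append(prefix[-1] + values[p])
    out = []
    for i in range(n):
        lo, hi = max(0, i - preceding), min(n - 1, i + following)
        a = bisect_left(positions, lo)    # first sampled row inside the frame
        b = bisect_right(positions, hi)   # one past the last sampled row
        if b > a:
            out.append((prefix[b] - prefix[a]) / (b - a))
        else:
            out.append(None)  # no sampled row fell inside this frame
    return out
```

With `rate=1.0` the estimate coincides with the exact result; lower rates trade accuracy for less work per window, and narrow frames may contain no sampled row at all, which this sketch reports as `None`.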

Original language: English
Pages (from-to): 40-51
Number of pages: 12
Journal: Data Science and Engineering
Volume: 3
Issue number: 1
State: Published - 1 Mar 2018

Keywords

  • Query optimization
  • Sample
  • Window function
