AQapprox: Aggregation Queries Approximation with Distribution-Aware Online Sampling

Han Wu, Xiaoling Wang, Xingjian Lu*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Approximate query processing (AQP) is an effective way to provide approximate results for SQL queries, relaxing accuracy in exchange for higher processing speed. In sampling-based AQP techniques, random sampling works well for uniformly distributed data but performs poorly on skewed data. To address this problem, we propose a distribution-aware approximation framework called AQapprox (aggregation queries approximation), which extends Sapprox to approximate queries more efficiently and accurately. We construct a probabilistic Map, which records the occurrences of sub-datasets on categorical columns and related statistics on numerical columns in each segment of the whole dataset. When a query arrives, AQapprox combines the Map with sampling methods chosen adaptively based on the data distribution. Experimental results on both real and synthetic datasets show that AQapprox achieves a speedup over Sapprox of up to 5.9x on skewed data and 64x on uniform data, and has higher accuracy on multi-column queries.
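The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_map`, `occurrences`, `choose_sampler`, `estimate_sum`), the fixed-size segments, the min/max statistics, the coefficient-of-variation threshold, and the use of a Hansen-Hurwitz estimator over whole segments are all assumptions made for illustration. The sketch records per-segment occurrences of each categorical value, then samples segments uniformly when occurrences are evenly spread and proportionally to occurrence when they are skewed.

```python
import random
import statistics
from collections import defaultdict


def build_map(rows, segment_size, cat_key, num_key):
    """Per fixed-size segment, record each category's row count and numeric min/max."""
    segment_map = []
    for start in range(0, len(rows), segment_size):
        stats = defaultdict(
            lambda: {"count": 0, "min": float("inf"), "max": float("-inf")}
        )
        for row in rows[start:start + segment_size]:
            entry = stats[row[cat_key]]
            entry["count"] += 1
            entry["min"] = min(entry["min"], row[num_key])
            entry["max"] = max(entry["max"], row[num_key])
        segment_map.append(dict(stats))
    return segment_map


def occurrences(segment_map, value):
    """Occurrences of one categorical value in every segment (0 if absent)."""
    return [seg.get(value, {"count": 0})["count"] for seg in segment_map]


def choose_sampler(segment_map, value, cv_threshold=0.5):
    """Return 'uniform' when occurrences are even across segments, else 'weighted'.

    The coefficient of variation and its threshold are illustrative assumptions.
    """
    counts = occurrences(segment_map, value)
    mean = statistics.fmean(counts)
    if mean == 0:
        return "uniform"
    cv = statistics.pstdev(counts) / mean
    return "uniform" if cv <= cv_threshold else "weighted"


def estimate_sum(rows, segment_map, segment_size, cat_key, num_key,
                 value, n_samples, rng):
    """Estimate SUM(num_key) WHERE cat_key == value from sampled segments."""
    counts = occurrences(segment_map, value)
    total = sum(counts)
    if total == 0:
        return 0.0
    if choose_sampler(segment_map, value) == "uniform":
        probs = [1.0 / len(counts)] * len(counts)
    else:
        # Sample segments with probability proportional to occurrence.
        probs = [c / total for c in counts]
    picks = rng.choices(range(len(counts)), weights=probs, k=n_samples)
    estimate = 0.0
    for i in picks:
        # Scanning the slice stands in for reading one sampled segment.
        seg_rows = rows[i * segment_size:(i + 1) * segment_size]
        seg_sum = sum(r[num_key] for r in seg_rows if r[cat_key] == value)
        # Hansen-Hurwitz: weight each draw by its inverse selection probability.
        estimate += seg_sum / probs[i]
    return estimate / n_samples
```

Under these assumptions, segments that never contain the queried value get zero selection probability in the weighted case, so no sample is spent on them; that is the intuition behind preferring occurrence-aware sampling on skewed data.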

Original language: English
Title of host publication: Web Information Systems Engineering – WISE 2020 - 21st International Conference, Proceedings
Editors: Zhisheng Huang, Wouter Beek, Hua Wang, Yanchun Zhang, Rui Zhou
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 404-416
Number of pages: 13
ISBN (Print): 9783030620073
DOIs
State: Published - 2020
Event: 21st International Conference on Web Information Systems Engineering, WISE 2020 - Amsterdam, Netherlands
Duration: 20 Oct 2020 → 24 Oct 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12343 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st International Conference on Web Information Systems Engineering, WISE 2020
Country/Territory: Netherlands
City: Amsterdam
Period: 20/10/20 → 24/10/20

Keywords

  • AQP
  • Distribution-aware approximation
  • Probabilistic map
