CDDTA-JOIN: One-pass OLAP algorithm for column-oriented databases

Min Jiao, Yansong Zhang, Yan Sun, Shan Wang, Xuan Zhou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

Row-stores commonly use a volcano-style "once-a-tuple" pipeline processor for processing efficiency, but lose I/O efficiency when only a small subset of columns is accessed in a wide table. Academic column-stores usually use "once-a-column" style processing for I/O and cache efficiency, but suffer from multi-pass column scans for complex queries. This paper focuses on how to achieve the maximal gains from both storage models, combining pipeline processing efficiency with column processing efficiency. Based on the "address-value" mapping for surrogate keys in dimension tables, we map incremental primary keys to offset addresses, so that the foreign keys in the fact table can be used as a native join index for dimensional tuples. We use predicate vectors as bitmap filters on the dimensions to enable star-join as a pipeline operator, and pre-generate hash aggregators for column-based aggregation. Using these approaches, star-join and pre-grouping can be completed in a one-pass scan over the dimensional attributes in the fact table, and the subsequent aggregate-column scan is responsible for sparse-access aggregation. We gain both I/O efficiency for vector processing and CPU efficiency for pipeline aggregation. We perform experiments on both a simulated column-based algorithm and a commercial column-store database.
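
The ideas named in the abstract (surrogate keys used as offsets, predicate vectors as dimension filters, and a single scan over the fact table's foreign key columns) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy schema, the column names, and the helpers `predicate_vector`, `one_pass_star_join`, and `aggregate` are all hypothetical. It also assumes surrogate keys run from 1 to n with no gaps, and it builds group keys during the scan instead of pre-generating hash aggregators.

```python
from collections import defaultdict

# Hypothetical toy star schema. Dimension surrogate keys are assumed to be
# 1..n, so key k lives at offset k - 1 in every column of that dimension.
date_year = [2010, 2011, 2011, 2012]        # dimension "date", column "year"
cust_region = ["ASIA", "EUROPE", "ASIA"]    # dimension "customer", column "region"

# Fact table stored column-wise; the foreign keys double as join indexes.
fact_date_fk = [1, 2, 3, 4, 2, 3]
fact_cust_fk = [1, 1, 2, 3, 3, 2]
fact_revenue = [10, 20, 30, 40, 50, 60]


def predicate_vector(column, predicate):
    """Evaluate a dimension predicate once per dimension row (a bitmap filter)."""
    return [predicate(value) for value in column]


def one_pass_star_join(fk_columns, predicate_vectors, group_columns):
    """Scan the foreign key columns once; return {fact row position: group key}."""
    survivors = {}
    for row, fks in enumerate(zip(*fk_columns)):
        # Address-value mapping: each foreign key indexes its predicate vector directly.
        if all(pv[fk - 1] for pv, fk in zip(predicate_vectors, fks)):
            survivors[row] = tuple(
                col[fk - 1]
                for col, fk in zip(group_columns, fks)
                if col is not None  # None means "do not group by this dimension"
            )
    return survivors


def aggregate(survivors, measure):
    """Touch the measure column only at the surviving positions (sparse access)."""
    totals = defaultdict(int)
    for row, key in survivors.items():
        totals[key] += measure[row]
    return dict(totals)


# SUM(revenue) WHERE year >= 2011 AND region = 'ASIA' GROUP BY year
vectors = [
    predicate_vector(date_year, lambda y: y >= 2011),
    predicate_vector(cust_region, lambda r: r == "ASIA"),
]
survivors = one_pass_star_join(
    [fact_date_fk, fact_cust_fk], vectors, [date_year, None]
)
result = aggregate(survivors, fact_revenue)
print(result)
```

In this sketch the join never probes a hash table: each foreign key is used as an array offset into the dimension's predicate vector and grouping column, and the measure column is read only for fact rows that passed every filter.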

Original language: English
Title of host publication: Web Technologies and Applications - 14th Asia-Pacific Web Conference, APWeb 2012, Proceedings
Pages: 448-459
Number of pages: 12
State: Published - 2012
Externally published: Yes
Event: 14th Asia Pacific Web Technology Conference, APWeb 2012 - Kunming, China
Duration: 11 Apr 2012 → 13 Apr 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7235 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th Asia Pacific Web Technology Conference, APWeb 2012
Country/Territory: China
City: Kunming
Period: 11/04/12 → 13/04/12

Keywords

  • CDDTA-JOIN
  • OLAP
  • column-store
  • predicate-vector
