One-size-fits-all OLAP technique for big data analysis

  • Yan Song Zhang*
  • Min Jiao
  • Zhan Wei Wang
  • Shan Wang
  • Xuan Zhou

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Rapidly expanding data volumes are pushing traditional OLAP into an era of large-scale analysis, characterized by high storage density, heavy workloads, and large-scale storage and processing capacity. Both traditional parallel databases and the widely discussed MapReduce technique face critical issues of performance and parallel processing efficiency when running big data analytical processing in a large-scale parallel processing framework. The performance of star-schema-based OLAP with star-join is limited by processing complexity and network transmission cost in parallel processing. This paper analyzes in depth the storage model and workload features of OLAP, and proposes optimization mechanisms and implementation technologies for the most fundamental SPJGA-OLAP subset, covering storage, processing, distribution, network transmission, and distributed buffering. The technical feasibility is evaluated with the commonly accepted TPC-H industrial benchmark and the SSB academic benchmark. The paper proposes a parallel OLAP framework centered on the predicate-vector DDTA-JOIN, which replaces diverse join execution plans with normalized predicate-vector processing. This enables a one-size-fits-all OLAP model for both centralized processing and large-scale parallel processing by taking advantage of modern hardware and minimizing network transmission and processing costs. An analysis of the storage cost and network transmission cost of the distribution mechanism is given for datasets of 1TB and 100TB. The technical feasibility and parallel processing efficiency are verified by OLAP cost model analysis and experiments on real data.
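
The abstract does not spell out how predicate-vector processing works, so the following is only a minimal sketch of one common reading: each dimension predicate is evaluated once into a vector indexed by dimension key, and the fact table is then filtered by positional lookups instead of per-query join plans. The toy tables, the assumption that foreign keys are array offsets, and the names `predicate_vector` and `star_query` are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy star schema: dimension rows live in arrays, and the fact
# table's foreign keys are ASSUMED to be array offsets (surrogate keys).
date_dim = [{"year": 1997}, {"year": 1998}, {"year": 1998}]
customer_dim = [{"region": "ASIA"}, {"region": "EUROPE"}, {"region": "ASIA"}]

# Fact rows: (date_key, customer_key, revenue)
fact = [(0, 0, 10), (1, 0, 20), (2, 1, 30), (1, 2, 40), (2, 2, 50)]


def predicate_vector(dim_rows, predicate):
    """Evaluate a predicate once per dimension row; slot i holds the result for key i."""
    return [predicate(row) for row in dim_rows]


def star_query(fact_rows, date_pv, customer_pv, date_rows):
    """Filter fact rows by probing the predicate vectors, then sum revenue per year."""
    totals = defaultdict(int)
    for date_key, customer_key, revenue in fact_rows:
        # Each join check is a positional lookup, not a hash-table probe.
        if date_pv[date_key] and customer_pv[customer_key]:
            totals[date_rows[date_key]["year"]] += revenue
    return dict(totals)


date_pv = predicate_vector(date_dim, lambda r: r["year"] == 1998)
customer_pv = predicate_vector(customer_dim, lambda r: r["region"] == "ASIA")
result = star_query(fact, date_pv, customer_pv, date_dim)
print(result)
```

Under this reading, every select-project-join-group-aggregate query follows the same shape (build vectors, scan the fact table once), and in a distributed setting only the compact vectors would need to be shipped to the nodes holding fact data, which is one plausible way to keep network transmission cost low.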

Original language: English
Pages (from-to): 1936-1946
Number of pages: 11
Journal: Jisuanji Xuebao/Chinese Journal of Computers
Volume: 34
Issue number: 10
State: Published - Oct 2011
Externally published: Yes

Keywords

  • Big data analytical processing
  • OLAP
  • Predicate-vector
  • Star schema
