
Divide, compress and conquer: Querying XML via partitioned path-based compressed data blocks

  • Wilfred Ng*
  • Ho Lam Lau
  • Aoying Zhou
  • *Corresponding author for this work
  • Hong Kong University of Science and Technology
  • Fudan University

Research output: Contribution to journal, article, peer-reviewed

Abstract

We propose a novel partition path-based (PPB) grouping strategy to store compressed XML data in a stream of blocks. In addition, we employ a minimal indexing scheme called block statistic signature (BSS) on the compressed data, which is a simple but effective technique to support evaluation of selection and aggregate XPath queries of the compressed data. We present a formal analysis and empirical study of these techniques. The BSS indexing is first extended into effective cluster statistic signature (CSS) and multiple-cluster statistic signature (MSS) indexing by establishing more layers of indexes. We analyze how the response time is affected by various parameters involved in our compression strategy such as the data stream block size, the number of cluster layers, and the query selectivity. We also gain further insight about the compression and querying performance by studying the optimal block size in a stream, which leads to the minimum processing cost for queries. The cost model analysis provides a solid foundation for predicting the querying performance. Finally, we demonstrate that our PPB grouping and indexing strategies are not only efficient enough to support path-based selection and aggregate queries of the compressed XML data, but they also require relatively low computation time and storage space when compared with other state-of-the-art compression strategies.
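The abstract describes block-level statistics that let selection and aggregate queries avoid decompressing every block. The following is only a minimal sketch of that general idea, not the authors' BSS format: the signature fields (minimum, maximum, sum, count), the `Block` class, the function names, the use of zlib, and the sample prices are all assumptions made for illustration.

```python
import json
import zlib
from dataclasses import dataclass


@dataclass
class Block:
    """One compressed block plus a small statistic signature (hypothetical layout)."""
    payload: bytes  # zlib-compressed JSON list of numeric values
    lo: float       # minimum value stored in the block
    hi: float       # maximum value stored in the block
    total: float    # sum of the values in the block
    count: int      # number of values in the block


def build_blocks(values, block_size):
    """Split the values found under one path into fixed-size compressed blocks."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        payload = zlib.compress(json.dumps(chunk).encode("utf-8"))
        blocks.append(Block(payload, min(chunk), max(chunk), sum(chunk), len(chunk)))
    return blocks


def select_range(blocks, lo, hi):
    """Return the values in [lo, hi] and how many blocks had to be decompressed."""
    hits, opened = [], 0
    for block in blocks:
        if block.hi < lo or block.lo > hi:
            continue  # the signature alone rules this block out
        opened += 1
        chunk = json.loads(zlib.decompress(block.payload).decode("utf-8"))
        hits.extend(v for v in chunk if lo <= v <= hi)
    return hits, opened


def total_sum(blocks):
    """Answer a whole-path sum from the signatures, with no decompression."""
    return sum(block.total for block in blocks)


# Hypothetical values for a path such as /catalog/item/price.
prices = [5, 7, 9, 12, 40, 42, 45, 48, 90, 95, 97, 99]
blocks = build_blocks(prices, block_size=4)
hits, opened = select_range(blocks, 41, 50)
print(hits, opened)       # [42, 45, 48] 1
print(total_sum(blocks))  # 589
```

In this toy run only one of the three blocks is decompressed for the range query, and the sum is answered from the signatures alone. Smaller blocks give tighter signatures but more of them to scan, which is one plausible reading of why block size matters in the cost analysis mentioned above.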

Original language: English
Pages (from-to): 169-197
Number of pages: 29
Journal: World Wide Web
Volume: 11
Issue number: 2
Publication status: Published - Jun 2008
Externally published: Yes
