BlockSketch: A Hybrid Tree-Based Sketch for Keyword Search in Blockchain Systems

  • Yushi Liu
  • , Xiaodong Qi
  • , Zhao Zhang
  • , Yanqin Yang*
  • , Cheqing Jin
  • , Aoying Zhou
  • *Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

Keyword search, which identifies transactions associated with specified keywords across historical blocks, is a critical query type in blockchain analytics. However, existing approaches, such as on-chain indexing and off-chain synchronization, may lead to significant space overhead or challenges in maintaining data freshness. To address these challenges, we propose BlockSketch, a novel probabilistic data structure (PDS) that adopts a differentiated encoding strategy, aimed at resolving the trade-off between query performance and storage overhead in blockchain indexing. BlockSketch features a hierarchical filtering architecture that combines Bloom filters and Sketches within a binary tree framework, enabling dynamic structural maintenance. Keywords are categorized as “hot” or “cold” based on their on-chain frequency and encoded into the most suitable component to achieve resource-efficient storage and accurate querying. In addition, BlockSketch integrates two distinct query rules, namely “level-down” and “jump,” to balance query accuracy and efficiency when processing keywords with varying frequencies. Furthermore, we enhance the query efficiency of BlockSketch by merging inefficient lower-level nodes into more compact ones and pruning redundant node checks during query execution. Extensive experiments on a real-world dataset demonstrate that BlockSketch delivers up to 73% faster query processing, achieves 44.56% of the average false positive rate of baselines at low multiplicity and as low as 1.52% at high multiplicity, and saves 79% in storage compared to state-of-the-art methods.

Original languageEnglish
JournalData Science and Engineering
DOIs
StateAccepted/In press - 2026

Keywords

  • Blockchain
  • Bloom filter
  • Keyword search
  • Sketch

Fingerprint

Dive into the research topics of 'BlockSketch: A Hybrid Tree-Based Sketch for Keyword Search in Blockchain Systems'. Together they form a unique fingerprint.

Cite this