Distributed SLCA-based XML keyword search by map-reduce

Chenjing Zhang, Qiang Ma, Xiaoling Wang, Aoying Zhou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

10 Scopus citations

Abstract

Large volumes of XML data arrive continually from new Web applications, and SLCA (Smallest Lowest Common Ancestor)-based XML keyword search is one of the most important information retrieval approaches. Previous approaches focus on building indexes for XML documents. However, in information dissemination scenarios, it is impossible to build an index in advance for continuous XML document streams. This paper addresses SLCA-based keyword search over continuous XML documents using the Map-Reduce mechanism. We use parallel algorithms to process large numbers of XML documents in a Hadoop environment. A distributed SLCA computation method is designed in which each network node computes SLCAs independently and only a small amount of information needs to be transmitted. We build a real Hadoop environment and demonstrate the efficiency of our algorithms analytically and experimentally.
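To make the SLCA notion concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes that each XML document fits on one worker, that keywords match whole lowercase tokens in tag names and text, and that nodes are identified by Dewey-style labels. The names `slca_labels`, `map_phase`, and `reduce_phase` are hypothetical, and the map and reduce steps are plain functions rather than Hadoop jobs.

```python
import xml.etree.ElementTree as ET


def slca_labels(xml_text, keywords):
    """Return Dewey-style labels of the SLCA nodes of one document.

    A node is an SLCA if its subtree contains every keyword and no
    child subtree already contains every keyword (naive bottom-up pass).
    """
    wanted = {k.lower() for k in keywords}
    if not wanted:
        return []
    results = []

    def visit(node, label):
        # Tokens that belong directly to this node: its tag and leading text.
        words = set((node.text or "").lower().split())
        words.add(node.tag.lower())
        child_is_full = False
        found = set()
        for i, child in enumerate(node):
            child_found = visit(child, label + (i,))
            if child_found == wanted:
                child_is_full = True
            found |= child_found
            # Text following a child element belongs to the parent node.
            words |= set((child.tail or "").lower().split())
        found |= wanted & words
        if found == wanted and not child_is_full:
            results.append(".".join(map(str, label)))
        return found

    visit(ET.fromstring(xml_text), (0,))
    return results


def map_phase(doc_id, xml_text, keywords):
    """Assumed map step: each worker handles one document independently."""
    return [(doc_id, label) for label in slca_labels(xml_text, keywords)]


def reduce_phase(pairs):
    """Assumed reduce step: group the emitted labels by document id."""
    merged = {}
    for doc_id, label in pairs:
        merged.setdefault(doc_id, []).append(label)
    return merged


DOC = (
    "<bib>"
    "<book><title>XML search</title><author>Ma</author></book>"
    "<book><title>Hadoop</title><author>Wang</author></book>"
    "</bib>"
)
```

In this sketch only `(doc_id, label)` pairs leave a worker, which loosely mirrors the abstract's point that little information needs to be transmitted. For `DOC`, the query `["xml", "ma"]` selects the first `book` element, while `["hadoop", "ma"]` is only satisfied at the root.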

Original language: English
Title of host publication: Database Systems for Advanced Applications - 15th International Conference, DASFAA 2010, International Workshops
Subtitle of host publication: GDM, BenchmarX, MCIS, SNSMW, DIEW, UDM, Revised Selected Papers
Pages: 386-397
Number of pages: 12
State: Published - 2010
Event: 15th International Conference on Database Systems for Advanced Applications, DASFAA 2010 - Tsukuba, Japan
Duration: 1 Apr 2010 → 4 Apr 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6193 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th International Conference on Database Systems for Advanced Applications, DASFAA 2010
Country/Territory: Japan
City: Tsukuba
Period: 1/04/10 → 4/04/10

Keywords

  • SLCA
  • XML
  • distributed system
  • keyword search
