
XML structural similarity search using MapReduce

  • Fudan University
  • Shanghai Key Laboratory of Intelligent Information Processing

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

XML is a de facto standard for web data exchange and information representation. Managing large volumes of XML data efficiently poses challenges for conventional techniques. To cope with large-scale data, the MapReduce computing framework has recently attracted growing attention in the database community. This paper proposes an efficient and scalable framework for XML structural similarity search on a large cluster with MapReduce. First, sub-structures of XML structures are extracted in parallel from a large XML corpus distributed across the cluster. Then, Min-Hashing and locality-sensitive hashing techniques are developed on the distributed, parallel computing framework for efficient structural similarity search. An empirical study on a cluster with large real-world datasets demonstrates the effectiveness and efficiency of the approach.
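The pipeline outlined in the abstract (extract sub-structures, compute Min-Hash signatures, group candidates with locality-sensitive hashing) can be illustrated with a small single-process sketch. This is not the authors' implementation. The abstract does not say which sub-structures are extracted, so the root-to-node tag paths used here are an assumption, as are the function names (`path_features`, `map_phase`, `reduce_phase`), the MD5-based hash family, and the parameters `NUM_HASHES` and `BANDS`. The map and reduce steps are plain Python functions standing in for MapReduce jobs.

```python
import hashlib
import xml.etree.ElementTree as ET
from collections import defaultdict
from itertools import combinations

NUM_HASHES = 24  # signature length (illustrative value, not from the paper)
BANDS = 12       # number of LSH bands; rows per band = NUM_HASHES // BANDS


def path_features(xml_text):
    """Assumed sub-structure extraction: the set of root-to-node tag paths."""
    root = ET.fromstring(xml_text)
    features = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        features.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return features


def _hash(seed, feature):
    """One member of a seeded hash family, built on MD5 for determinism."""
    digest = hashlib.md5(f"{seed}:{feature}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(features):
    """Min-Hash signature: the minimum hash value under each seeded hash."""
    return tuple(min(_hash(seed, f) for f in features)
                 for seed in range(NUM_HASHES))


def map_phase(doc_id, xml_text):
    """Map: emit ((band index, band values), doc_id) for every LSH band."""
    sig = minhash_signature(path_features(xml_text))
    rows = NUM_HASHES // BANDS
    for b in range(BANDS):
        yield (b, sig[b * rows:(b + 1) * rows]), doc_id


def reduce_phase(pairs):
    """Reduce: documents that share a bucket key become candidate pairs."""
    buckets = defaultdict(list)
    for key, doc_id in pairs:
        buckets[key].append(doc_id)
    candidates = set()
    for ids in buckets.values():
        for a, b in combinations(sorted(set(ids)), 2):
            candidates.add((a, b))
    return candidates


def jaccard(a, b):
    """Exact Jaccard similarity, used to verify candidate pairs."""
    return len(a & b) / len(a | b)


# Toy corpus: d1 and d2 share a structure, d3 does not.
docs = {
    "d1": "<lib><book><title/><author/></book></lib>",
    "d2": "<lib><book><title/><author/></book></lib>",
    "d3": "<shop><item><price/></item></shop>",
}
pairs = [kv for doc_id, text in docs.items() for kv in map_phase(doc_id, text)]
candidates = reduce_phase(pairs)
print(candidates)
```

In a real deployment the emitted key-value pairs would be shuffled by the framework so that each reducer receives one bucket key. Candidate pairs would then be verified with an exact similarity measure such as `jaccard` before being reported.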

Original language: English
Title of host publication: Web-Age Information Management - 11th International Conference, WAIM 2010, Proceedings
Pages: 169-181
Number of pages: 13
DOI
Publication status: Published - 2010
Event: 11th International Conference on Web-Age Information Management, WAIM 2010 - Jiuzhaigou, China
Duration: 15 Jul 2010 to 17 Jul 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6184 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th International Conference on Web-Age Information Management, WAIM 2010
Country/Territory: China
City: Jiuzhaigou
Period: 15/07/10 to 17/07/10
