Incremental mining of schema for semistructured data

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

Semistructured data is characterized by the lack of any fixed and rigid schema, even though some implicit structure typically appears in the data. The large number of online applications makes it imperative to mine the schema of semistructured data, both for users (e.g., to gather useful information and facilitate querying) and for systems (e.g., to optimize access). The critical problem is to discover the implicit structure in the semistructured data. Current methods for extracting Web data structure are either general and independent of application background [8], [9], or bound to a concrete environment such as HTML [13], [14], [15]. Both kinds are costly and have difficulty keeping up with the frequent and complicated changes in Web data. In this paper, we first address the problem of incrementally mining the schema of semistructured data after the raw data has been updated. We provide an algorithm for incrementally mining the schema of semistructured data, together with experimental results which show that incremental mining is more efficient than non-incremental mining.

Original language: English
Title of host publication: Methodologies for Knowledge Discovery and Data Mining - 3rd Pacific-Asia Conference, PAKDD 1999, Proceedings
Editors: Ning Zhong, Lizhu Zhou
Publisher: Springer Verlag
Pages: 159-168
Number of pages: 10
ISBN (Print): 3540658661, 9783540658665
State: Published - 1999
Externally published: Yes
Event: 3rd Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 1999 - Beijing, China
Duration: 26 Apr 1999 to 28 Apr 1999

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1574
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 1999
Country/Territory: China
City: Beijing
Period: 26/04/99 to 28/04/99

Keywords

  • Algorithm
  • Data mining
  • Incremental mining
  • Schema
  • Semistructured data
