Abstract
Uncertain data management is becoming an important research focus. Because XML is the main standard for storing and exchanging web data, managing uncertain XML data has naturally become a hot topic, and keyword-based search over probabilistic XML is one of its branches. Recent work on keyword search over probabilistic XML has discussed only the independent and mutually exclusive relationships among sibling nodes. Because of the complexity of representation and computation, more general relationships among sibling nodes have received little attention so far. This paper addresses the problem of keyword filtering over the probabilistic XML data model PrXML{exp, ind, mux}. In this model, an exp node represents more general relationships among sibling nodes. We define tab, the keyword distribution probability table of a subtree, together with its dot product, Cartesian product, and addition operations. We then show how to compute tab for each type of node. Furthermore, we give an algorithm that obtains the SLCA (smallest lowest common ancestor) nodes, and the probability of each node being an SLCA, without generating possible worlds. Finally, the features and efficiency of our method are evaluated through extensive experiments.
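To make the idea of a keyword distribution probability table concrete, the following is a minimal sketch and not the paper's definitions. It assumes a table maps each subset of the query keywords (encoded as a bitmask) to the probability that a subtree contains exactly that subset. The abstract does not state how the dot product, Cartesian product, and addition operations are defined, so the helpers `leaf`, `scale`, `combine_ind`, and `combine_mux` are hypothetical stand-ins that cover only ind and mux nodes; exp nodes are not modelled.

```python
from itertools import product


def leaf(mask):
    """Table for a node whose own text contains exactly the keywords in `mask`."""
    return {mask: 1.0}


def scale(tab, p):
    """Child exists with probability p; if absent it contributes no keywords."""
    out = {m: p * q for m, q in tab.items()}
    out[0] = out.get(0, 0.0) + (1.0 - p)
    return out


def combine_ind(tabs):
    """Independent children: union the keyword sets, multiply the probabilities."""
    acc = {0: 1.0}
    for tab in tabs:
        nxt = {}
        for (m1, p1), (m2, p2) in product(acc.items(), tab.items()):
            nxt[m1 | m2] = nxt.get(m1 | m2, 0.0) + p1 * p2
        acc = nxt
    return acc


def combine_mux(children):
    """Mutually exclusive children given as (table, probability) pairs.

    At most one child is chosen; the leftover probability mass means
    no child is chosen, which contributes the empty keyword set.
    """
    out = {}
    rest = 1.0
    for tab, p in children:
        for m, q in tab.items():
            out[m] = out.get(m, 0.0) + p * q
        rest -= p
    out[0] = out.get(0, 0.0) + rest
    return out


# Two query keywords: bit 0 is the first keyword, bit 1 is the second.
a = leaf(0b01)
b = leaf(0b10)
ind_tab = combine_ind([scale(a, 0.5), scale(b, 0.4)])
mux_tab = combine_mux([(a, 0.5), (b, 0.4)])
```

Under these assumptions, the ind table assigns probability of about 0.2 to the full keyword set `0b11` (both children present), while the mux table assigns it none, since the two children can never occur together. Both tables are computed bottom-up from child tables, without enumerating possible worlds.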
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 1959-1971 |
| Number of pages | 13 |
| Journal | Jisuanji Xuebao/Chinese Journal of Computers |
| Volume | 37 |
| Issue number | 9 |
| State | Published - 1 Sep 2014 |
Keywords
- Keyword distribution probability table
- Keyword filtering
- Probabilistic XML
- Smallest lowest common ancestor
- Uncertain data