TY - JOUR
T1 - A survey on management of data provenance
AU - Gao, Ming
AU - Jin, Che Qing
AU - Wang, Xiao Ling
AU - Tian, Xiu Xia
AU - Zhou, Ao Ying
PY - 2010/3
Y1 - 2010/3
N2 - Data provenance describes how data is generated and how it evolves over time. It has many applications, including data quality evaluation, audit trails, replication recipes, and data citation. In general, data provenance can be recorded across multiple data sources or within a single data source; in other words, the derivation history of data can be captured either at the schema level or at the instance level. This paper surveys research on the representation and querying of data provenance at both levels. At the schema level, the focus is on query rewriting and schema mappings; at the instance level, the focus includes relational, XML, and streaming data provenance. Research on uncertain data provenance, which tracks the derivation of both data and its uncertainty, is also summarized. Finally, the paper lists applications of data provenance, discusses the main challenges, and points out directions for future research.
AB - Data provenance describes how data is generated and how it evolves over time. It has many applications, including data quality evaluation, audit trails, replication recipes, and data citation. In general, data provenance can be recorded across multiple data sources or within a single data source; in other words, the derivation history of data can be captured either at the schema level or at the instance level. This paper surveys research on the representation and querying of data provenance at both levels. At the schema level, the focus is on query rewriting and schema mappings; at the instance level, the focus includes relational, XML, and streaming data provenance. Research on uncertain data provenance, which tracks the derivation of both data and its uncertainty, is also summarized. Finally, the paper lists applications of data provenance, discusses the main challenges, and points out directions for future research.
KW - Data integration
KW - Data provenance
KW - Data space
KW - Provenance semiring
KW - Uncertain data
UR - https://www.scopus.com/pages/publications/77951882028
U2 - 10.3724/SP.J.1016.2010.00373
DO - 10.3724/SP.J.1016.2010.00373
M3 - Article
AN - SCOPUS:77951882028
SN - 0254-4164
VL - 33
SP - 373
EP - 389
JO - Jisuanji Xuebao/Chinese Journal of Computers
JF - Jisuanji Xuebao/Chinese Journal of Computers
IS - 3
ER -