TY - JOUR
T1 - Functional dependency and conditional constraint based data repair
AU - Jin, Che Qing
AU - Liu, Hui Ping
AU - Zhou, Ao Ying
N1 - Publisher Copyright:
© Copyright 2016, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
PY - 2016/7/1
Y1 - 2016/7/1
N2 - Along with the development of economy and information technology, a large amount of data are produced in many applications. However, due to the influence of some factors, such as hardware equipments, manual operations, and multi-source data integration, serious data quality issues sunch as data inconsistency arise, which makes it more challenging to manage data effectively. Hence, it is crucial to develop new data cleaning technology to improve data quality to better support data management and analysis. Existing work in this area mainly focuses on the situation where functional dependencies are used to describe data inconsistency. Once some violations are detected, some tuples must be changed to suit for the functional dependency set via update (instead of insert or delete). Besides functional dependency, there also exist other types of constraints, such as the hard constraint, quantity constraint, equivalent constraint, and non-equivalent constraint. However, it becomes more difficult when more inconsistent conditions are involved. This paper addresses the general scenario where functional dependencies and other constraints co-exist. Corresponding data repair algorithm is designed to improve the data quality effectively. Experimental results show that the proposed method performs effectively and efficiently.
AB - Along with the development of economy and information technology, a large amount of data are produced in many applications. However, due to the influence of some factors, such as hardware equipments, manual operations, and multi-source data integration, serious data quality issues sunch as data inconsistency arise, which makes it more challenging to manage data effectively. Hence, it is crucial to develop new data cleaning technology to improve data quality to better support data management and analysis. Existing work in this area mainly focuses on the situation where functional dependencies are used to describe data inconsistency. Once some violations are detected, some tuples must be changed to suit for the functional dependency set via update (instead of insert or delete). Besides functional dependency, there also exist other types of constraints, such as the hard constraint, quantity constraint, equivalent constraint, and non-equivalent constraint. However, it becomes more difficult when more inconsistent conditions are involved. This paper addresses the general scenario where functional dependencies and other constraints co-exist. Corresponding data repair algorithm is designed to improve the data quality effectively. Experimental results show that the proposed method performs effectively and efficiently.
KW - Conditional constraint
KW - Data quality
KW - Data repair
KW - Equivalence class
KW - Functional dependency
UR - https://www.scopus.com/pages/publications/84978039779
U2 - 10.13328/j.cnki.jos.005037
DO - 10.13328/j.cnki.jos.005037
M3 - 文章
AN - SCOPUS:84978039779
SN - 1000-9825
VL - 27
SP - 1671
EP - 1684
JO - Ruan Jian Xue Bao/Journal of Software
JF - Ruan Jian Xue Bao/Journal of Software
IS - 7
ER -