Abstract
Organizational data is confronted with all kinds of data quality problems. The process of data cleaning therefore becomes crucial because of the "garbage in, garbage out" principle. However, it is not trivial to make the data cleaning process flexible enough. In this paper, we present an open and extensible framework for data cleaning. It gains its extensibility by employing innovative features such as a term model, a processing description file, and a rule and dictionary (rule&Dic) base. A visual GUI environment is implemented, and workflow capability is provided in this system.
| Original language | English |
|---|---|
| Pages (from-to) | 189-192 |
| Number of pages | 4 |
| Journal | Journal of Shanghai University |
| Volume | 5 |
| Issue number | SUPPL. SEPT. |
| State | Published - Sep 2001 |
| Externally published | Yes |
| Event | 2nd International Conference on Computer and Information Technology (CIT'2001) - Shanghai, China; Duration: 12 Sep 2001 → 15 Sep 2001 |
Keywords
- Data cleaning
- Data preparation
- Term model