HATDC: A holistic approach for time series data repairing

  • Xiaojie Liu
  • Guangxuan Song
  • Xiaoling Wang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Time series data is prevalent in real life, and time series data mining is a hot research topic. However, real data sets may contain many anomalous values caused by sensor errors, which makes data mining difficult. To improve the quality of data mining, it is necessary to repair the data before analysis. Most existing repairing methods use smoothing-based or constraint-based techniques, but they consider only a few adjacent points and ignore global holistic information. In this paper, we propose a novel time series data repairing algorithm, named HATDC, that can exploit the holistic information of the time series. First, we use speed constraints and the probability distribution of change rates to detect the dirty data points. After that, dynamic time warping (DTW) is applied as the distance measure to find similar subsequences in the series, and we estimate the values of these abnormal data points from the selected similar subsequences, taking the whole series into account. In addition, we propose an improved algorithm based on incremental clustering that reduces the time cost. Experiments on several real datasets demonstrate that HATDC has a significantly higher repair accuracy and a lower RMS error than other methods.
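
The two building blocks named in the abstract, a speed constraint for flagging suspicious points and DTW as a subsequence distance, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the bounds `s_min` and `s_max`, and the absolute-difference cost are assumptions chosen for illustration, and the probability-based detection, the value estimation, and the incremental clustering steps are not shown.

```python
from math import inf


def speed_violations(times, values, s_min, s_max):
    """Return indices i whose change rate from point i-1 lies outside [s_min, s_max].

    Assumption: a plain pairwise speed check between consecutive points.
    Timestamps must be strictly increasing.
    """
    flagged = []
    for i in range(1, len(values)):
        rate = (values[i] - values[i - 1]) / (times[i] - times[i - 1])
        if rate < s_min or rate > s_max:
            flagged.append(i)
    return flagged


def dtw_distance(a, b):
    """Classic dynamic time warping distance in O(len(a) * len(b)) time.

    Assumption: the local cost is the absolute difference of the two values.
    """
    n, m = len(a), len(b)
    # d[i][j] holds the best alignment cost of a[:i] against b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Hypothetical readings with one spike at index 2.
times = [0, 1, 2, 3, 4]
values = [1.0, 1.2, 9.0, 1.4, 1.5]
flagged = speed_violations(times, values, s_min=-1.0, s_max=1.0)
```

Note that a pairwise check flags both the spike and the point after it, because the jump back to normal also violates the bound. Telling the two apart requires additional information, such as the change-rate distribution the abstract mentions.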

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 23rd Pacific-Asia Conference, PAKDD 2019, Proceedings
Editors: Zhiguo Gong, Min-Ling Zhang, Sheng-Jun Huang, Zhi-Hua Zhou, Qiang Yang
Publisher: Springer Verlag
Pages: 553-564
Number of pages: 12
ISBN (Print): 9783030161446
State: Published - 2019
Event: 23rd Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2019 - Macau, China
Duration: 14 Apr 2019 to 17 Apr 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11440 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 23rd Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2019
Country/Territory: China
City: Macau
Period: 14/04/19 to 17/04/19

Keywords

  • Anomaly detection
  • DTW
  • Data repairing
  • Time series
