HOT: Hypergraph-based outlier test for categorical data

  • Fudan University
  • Simon Fraser University
  • Chinese University of Hong Kong

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

As a widely used data mining technique, outlier detection aims to find anomalies and explain them well. Most existing methods are designed for numeric data and run into problems in real-life applications that contain categorical data. In this paper, we introduce a novel outlier mining method based on a hypergraph model. Since hypergraphs precisely capture the distribution characteristics in data subspaces, this method is effective in identifying anomalies in dense subspaces and presents good interpretations for the local outlierness. By selecting the most relevant subspaces, the problem of the "curse of dimensionality" in very large databases can also be ameliorated. Furthermore, the connectivity property replaces distance metrics, so distance-based computation is no longer needed, which enhances robustness when handling data with missing values. Because connectivity computation can use the aggregation operations supported by most SQL-compatible database systems, the mining process is much more efficient. Finally, experiments and analysis show that our method can find outliers in categorical data with good performance and quality.
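The abstract's idea of judging a categorical value by how often it occurs inside a dense subspace, using counts instead of distances, can be illustrated with a small Python sketch. This is not the paper's algorithm: the function name `value_deviations`, the z-score formula, the threshold of -1.0, and the toy records are all assumptions made for illustration. The per-value counting plays the role of a SQL `GROUP BY` with `COUNT(*)`, and rows with a missing value are simply skipped.

```python
from collections import Counter
from statistics import mean, pstdev

def value_deviations(records, subspace, target):
    """Score each value of `target` among the rows that match `subspace`.

    Hypothetical helper, not taken from the paper. `subspace` maps
    attribute names to required values (a stand-in for one dense
    subspace). The score is the z-score of a value's count, so strongly
    negative scores mark values that are rare inside the subspace.
    """
    # Keep rows that match the subspace and have a known target value.
    # Rows with missing data are skipped; no distance is ever computed.
    group = [
        r for r in records
        if all(r.get(a) == v for a, v in subspace.items())
        and r.get(target) is not None
    ]
    # Equivalent to: SELECT target, COUNT(*) ... GROUP BY target
    counts = Counter(r[target] for r in group)
    if not counts:
        return {}
    freqs = list(counts.values())
    mu, sigma = mean(freqs), pstdev(freqs)
    if sigma == 0:
        # All values are equally frequent, so nothing stands out.
        return {value: 0.0 for value in counts}
    return {value: (count - mu) / sigma for value, count in counts.items()}

# Toy data (invented): car types recorded for two age groups.
records = (
    [{"age": "middle", "car": "sedan"}] * 5
    + [{"age": "middle", "car": "minivan"}] * 4
    + [{"age": "middle", "car": "sports"}]
    + [{"age": "middle", "car": None}]        # missing value, ignored
    + [{"age": "young", "car": "sports"}] * 3
)

THRESHOLD = -1.0  # assumed cut-off, chosen only for this example
scores = value_deviations(records, {"age": "middle"}, "car")
flagged = sorted(v for v, s in scores.items() if s < THRESHOLD)
print(flagged)
```

In this toy data, "sports" is common among young drivers but rare within the `age = middle` subspace, so it is the only value whose score falls below the assumed threshold. The subspace itself also explains why the value is flagged.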

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining
Editors: Kyu-Young Wang, Jongwoo Jeon, Kyuseok Shim, Jaideep Srivastava
Publisher: Springer Verlag
Pages: 399-410
Number of pages: 12
ISBN (Electronic): 3540047603, 9783540047605
DOI
Publication status: Published - 2003
Externally published: Yes
Event: 7th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2003 - Seoul, South Korea
Duration: 30 Apr 2003 to 2 May 2003

Publication series

Name: Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)
Volume: 2637
ISSN (Print): 0302-9743

Conference

Conference: 7th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2003
Country/Territory: South Korea
City: Seoul
Period: 30/04/03 to 2/05/03
