Mining non-redundant diverse patterns: An information theoretic perspective

Chaofeng Sha, Jian Gong, Aoying Zhou

Research output: Contribution to journal › Article › peer-review

Abstract

The discovery of diversity patterns from binary data is an important data mining task. In this paper, we propose the problem of mining highly diverse patterns called non-redundant diversity patterns (NDPs). In this framework, entropy is adopted to measure the diversity of itemsets. In addition, an algorithm called NDP miner is proposed that exploits both the monotone properties of the entropy diversity measure and pruning power to discover non-redundant diversity patterns efficiently. Finally, experimental results show that the NDP miner can efficiently identify non-redundant diversity patterns.
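
As an illustration of how entropy can score the diversity of an itemset, the following is a minimal sketch, not the paper's NDP miner. It assumes the measure is the joint Shannon entropy of the items' presence/absence pattern across transactions; the abstract does not spell out the exact definition. The function name `itemset_entropy` and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def itemset_entropy(transactions, itemset):
    """Joint Shannon entropy (in bits) of the presence/absence pattern
    of the items in `itemset` over a list of transactions (sets).

    Assumption: this is one common entropy-based diversity measure,
    not necessarily the exact definition used in the paper.
    """
    items = sorted(itemset)
    n = len(transactions)
    # Count how often each binary presence/absence pattern occurs.
    counts = Counter(tuple(item in t for item in items) for t in transactions)
    return -sum((c / n) * log2(c / n) for c in counts.values())


# Toy binary data: four transactions over items "a" and "b".
transactions = [{"a", "b"}, {"a"}, {"b"}, set()]

h_a = itemset_entropy(transactions, {"a"})        # "a" is present in half the rows
h_ab = itemset_entropy(transactions, {"a", "b"})  # all four patterns occur once
```

Under this assumed definition, adding an item to an itemset can never lower its joint entropy, which is the kind of monotone property a depth-first search can use for pruning. In the toy data, `h_ab` is at least as large as `h_a`.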

Original language: English
Pages (from-to): 89-99
Number of pages: 11
Journal: Frontiers of Computer Science in China
Volume: 4
Issue number: 1
State: Published - Feb 2010

Keywords

  • Depth-first search
  • Diverse pattern
  • Entropy
