Automatic filtering algorithm for imbalanced classification

  • Wei Gong*
  • Youjie Zhou
  • Hangzai Luo
  • Jianping Fan
  • Aoying Zhou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

Imbalanced data sets have been reported to hinder many machine learning algorithms in both classification accuracy and training speed. Yet extremely imbalanced data sets (3-5% positive samples) are common in many applications, such as multimedia semantic classification. In this paper, we propose a novel algorithm that automatically removes training samples that have no effect or a negative effect on classifier training. With our algorithm, most easy-to-classify dominant-class samples in an imbalanced training set are eliminated automatically. As a result, the ratio of minority-class samples increases significantly, making the training set more suitable for classification algorithms. Experiments show that our algorithm preserves the classification accuracy of SVM while decreasing training time dramatically.
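The abstract does not say how "easy-to-classify" samples are identified, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a simple centroid-distance heuristic as a stand-in: a dominant-class sample is treated as easy when it lies much closer to its own class centroid than to the minority centroid. The function name `filter_easy_majority` and the parameters `keep_fraction` and `majority_label` are hypothetical.

```python
import math
import random


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def filter_easy_majority(samples, labels, keep_fraction=0.25, majority_label=0):
    """Keep every minority sample and only the hardest majority samples.

    Assumption (not from the paper): "hard" means a small margin between the
    distance to the minority centroid and the distance to the majority centroid.
    """
    pairs = list(zip(samples, labels))
    majority = [(x, y) for x, y in pairs if y == majority_label]
    minority = [(x, y) for x, y in pairs if y != majority_label]
    c_maj = centroid([x for x, _ in majority])
    c_min = centroid([x for x, _ in minority])

    def margin(pair):
        # Large margin: far from the minority class, so easy to classify.
        x, _ = pair
        return math.dist(x, c_min) - math.dist(x, c_maj)

    n_keep = max(1, int(len(majority) * keep_fraction))
    hardest = sorted(majority, key=margin)[:n_keep]

    kept = minority + hardest
    return [x for x, _ in kept], [y for _, y in kept]


# Synthetic data with 4% positives, inside the 3-5% range the abstract mentions.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(960)]
X += [[rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)] for _ in range(40)]
y = [0] * 960 + [1] * 40

X_kept, y_kept = filter_easy_majority(X, y, keep_fraction=0.25)
print(f"minority ratio before: {y.count(1) / len(y):.3f}")
print(f"minority ratio after:  {y_kept.count(1) / len(y_kept):.3f}")
```

The reduced set could then be passed to any classifier, such as an SVM. Because all minority samples are retained and only a fraction of the dominant class survives, the minority ratio rises and the training set shrinks, which is the effect the abstract describes.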

Original language: English
Title of host publication: Proceedings - 2010 7th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2010
Pages: 1853-1857
Number of pages: 5
State: Published - 2010
Event: 2010 7th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2010 - Yantai, Shandong, China
Duration: 10 Aug 2010 to 12 Aug 2010

Publication series

Name: Proceedings - 2010 7th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2010
Volume: 4

Conference

Conference: 2010 7th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2010
Country/Territory: China
City: Yantai, Shandong
Period: 10/08/10 to 12/08/10
