Boosting prediction performance on imbalanced dataset

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Mining imbalanced data is an important problem from both the algorithmic and the performance-evaluation perspectives. When a dataset is imbalanced, a classification technique does not treat the two classes equally. Standard classifiers are poorly suited to imbalanced data, since they are likely to classify all instances into the majority class, which is the less important class. Additionally, some performance measures, such as accuracy, are known to be biased on imbalanced data and do not reflect classifier quality well. In this paper, we apply various techniques commonly used to handle class imbalance before passing the data to the classifiers. However, the performance of the classifiers is found to degrade because of the highly imbalanced nature of the datasets. Hence, we propose an integrated sampling technique with an ensemble of AdaBoost to improve prediction performance. We also show empirically which performance measures are more appropriate for mining imbalanced datasets.
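The two points made in the abstract, that accuracy is misleading under imbalance and that resampling is applied before classification, can be illustrated with a minimal sketch. The abstract does not specify which measures or which sampling scheme the paper uses, so everything below is an assumption for illustration: the function names `imbalance_metrics` and `hybrid_resample` are hypothetical, the measures shown (recall, specificity, F1, G-mean) are common choices rather than the paper's confirmed ones, and the resampling is plain random under- and oversampling to a common class size, not the authors' integrated technique. The AdaBoost ensemble step is not shown.

```python
import math
import random
from collections import Counter


def imbalance_metrics(y_true, y_pred, positive=1):
    """Compute accuracy plus measures that are less sensitive to imbalance.

    `positive` is assumed to be the minority (more important) class label.
    """
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # minority-class hit rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # majority-class hit rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return {"accuracy": accuracy, "recall": recall, "specificity": specificity,
            "precision": precision, "f1": f1, "g_mean": g_mean}


def hybrid_resample(X, y, positive=1, seed=0):
    """Illustrative hybrid sampling (an assumption, not the paper's method).

    Randomly undersamples the majority class and randomly oversamples the
    minority class (with replacement) so both end up at the midpoint size.
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == positive]
    majority = [i for i, label in enumerate(y) if label != positive]
    if not minority or not majority:
        raise ValueError("both classes must be present")
    target = (len(minority) + len(majority)) // 2
    kept_majority = rng.sample(majority, min(target, len(majority)))
    extra_minority = [rng.choice(minority)
                      for _ in range(target - len(minority))]
    chosen = kept_majority + minority + extra_minority
    rng.shuffle(chosen)
    return [X[i] for i in chosen], [y[i] for i in chosen]


if __name__ == "__main__":
    # Toy data: 5 minority instances, 95 majority instances.
    y_true = [1] * 5 + [0] * 95
    # A degenerate classifier that always predicts the majority class.
    y_pred = [0] * 100
    print(imbalance_metrics(y_true, y_pred))

    X = list(range(100))
    X_res, y_res = hybrid_resample(X, y_true)
    print(Counter(y_res))
```

On this toy data the always-majority classifier reaches an accuracy of 0.95 while its recall, F1 and G-mean are all 0, which is the bias the abstract refers to. After resampling, both classes hold 50 instances each, and a classifier or boosted ensemble would then be trained on the rebalanced set.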

Original language: English
Pages (from-to): 186-195
Number of pages: 10
Journal: International Journal of Information and Communication Technology
Volume: 13
Issue number: 2
State: Published - 2018
Externally published: Yes

Keywords

  • Classification
  • Ensemble
  • Imbalanced dataset
  • Re-sampling
