Cancer diagnosis using generative adversarial networks based on deep learning from imbalanced data

  • Yawen Xiao*
  • Jun Wu
  • Zongli Lin

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

70 Scopus citations

Abstract

Background and objective: Cancer is a serious global health problem because of its high mortality, and accurate diagnosis is the key to effective treatment. However, because sampling is difficult and sample sizes are limited in clinical practice, data imbalance is common in cancer diagnosis, whereas most conventional classification methods assume a balanced data distribution. Addressing the imbalanced learning problem is therefore important for improving the predictive performance of cancer diagnosis.

Methods: In this study, we analyze the data imbalance prevalent in cancer gene expression data and present an improved deep-learning-based Wasserstein generative adversarial network (WGAN) model, which provides a reliable indicator of training progress and learns the characteristics of the data in depth. The WGAN generates new minority-class samples and thereby addresses the imbalance problem at the data level.

Results: We analyze three publicly available RNA-seq data sets covering three kinds of cancer using the proposed WGAN and compare the results with those from two commonly adopted sampling methods. The results show that addressing the data imbalance, which both balances the data distribution and expands the sample size, increases the prediction accuracy in all three data sets.

Conclusions: The proposed WGAN method is therefore superior in solving the imbalanced learning problem of gene expression data, providing significantly better prediction performance in cancer diagnosis.
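The data-level balancing idea described in the Methods can be illustrated with a deliberately tiny sketch. This is not the authors' model: the abstract does not give the architecture or the details of the "improved" WGAN, so the code below assumes a linear critic, a generator that only learns an offset added to Gaussian noise, and the weight-clipping form of the Wasserstein objective. All function names and hyperparameters (`train_toy_wgan`, `balance_minority`, `clip`, `n_critic`) are hypothetical and chosen only for illustration.

```python
import random


def critic(w, x):
    """Linear critic score f(x) = w . x (toy stand-in for a deep critic)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def generate(b, z):
    """Toy generator g(z) = b + z: Gaussian noise shifted by a learned offset."""
    return [bi + zi for bi, zi in zip(b, z)]


def noise(rng, dim):
    """Draw one standard-normal noise vector."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def mean_vec(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def train_toy_wgan(minority, steps=200, n_critic=5, lr=0.05, clip=0.1,
                   batch=8, seed=0):
    """Train the toy generator offset b on minority-class samples only."""
    rng = random.Random(seed)
    dim = len(minority[0])
    w = [0.0] * dim  # critic weights
    b = [0.0] * dim  # generator offset
    for _ in range(steps):
        for _ in range(n_critic):
            real = rng.sample(minority, k=min(batch, len(minority)))
            fake = [generate(b, noise(rng, dim)) for _ in real]
            # Critic loss: mean f(fake) - mean f(real).
            # For a linear critic its gradient w.r.t. w is mean(fake) - mean(real).
            grad = [f - r for f, r in zip(mean_vec(fake), mean_vec(real))]
            # Gradient step, then weight clipping to keep the critic Lipschitz.
            w = [max(-clip, min(clip, wi - lr * gi)) for wi, gi in zip(w, grad)]
        # Generator loss: -mean f(g(z)); its gradient w.r.t. b is -w.
        b = [bi + lr * wi for bi, wi in zip(b, w)]
    return b


def balance_minority(majority, minority, steps=200, seed=0):
    """Append synthetic minority samples until both classes have equal size."""
    b = train_toy_wgan(minority, steps=steps, seed=seed)
    rng = random.Random(seed + 1)
    dim = len(minority[0])
    n_needed = max(0, len(majority) - len(minority))
    synthetic = [generate(b, noise(rng, dim)) for _ in range(n_needed)]
    return minority + synthetic
```

Calling `balance_minority(majority, minority)` returns the original minority samples followed by enough synthetic ones to match the majority class size; a classifier would then be trained on the majority samples together with this enlarged minority set. A realistic version for RNA-seq data would replace the linear critic and offset-only generator with deep networks.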

Original language: English
Article number: 104540
Journal: Computers in Biology and Medicine
Volume: 135
State: Published - Aug 2021

Keywords

  • Cancer diagnosis
  • Deep learning
  • Gene expression data
  • Imbalanced data
  • Wasserstein generative adversarial networks
