A semi-supervised deep learning method based on stacked sparse auto-encoder for cancer prediction using RNA-seq data

Yawen Xiao, Jun Wu, Zongli Lin, Xiaodong Zhao

Research output: Contribution to journal › Article › peer-review

93 Scopus citations

Abstract

Background and objective: Cancer is a complex health problem with high mortality. Over the past few decades, the rapid development of high-throughput sequencing technology and the application of various machine learning methods have enabled remarkable progress in cancer research based on gene expression data. At the same time, a growing amount of high-dimensional data, such as RNA-seq data, has been generated. This calls for machine learning methods that can handle large volumes of data effectively in order to support accurate treatment decisions.

Methods: In this paper, we present a semi-supervised deep learning strategy for cancer prediction using RNA-seq data: classification based on a stacked sparse auto-encoder (SSAE). The proposed SSAE-based method employs greedy layer-wise pre-training and a sparsity penalty term to capture and extract important information from the high-dimensional data, and then classifies the samples.

Results: We tested the proposed SSAE model on three public RNA-seq data sets covering three types of cancer and compared its prediction performance with that of several commonly used classification methods. The results indicate that our approach outperforms the other methods on all three cancer data sets across several metrics.

Conclusions: The proposed SSAE-based semi-supervised deep learning model shows a promising ability to process high-dimensional gene expression data and proves to be effective and accurate for cancer prediction.
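The two ingredients named in the Methods summary, greedy layer-wise pre-training and a sparsity penalty, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the architecture, loss, or hyperparameters, so the following are all assumptions made for illustration: sigmoid activations, a squared-error reconstruction loss, a KL-divergence form of the sparsity penalty (a common choice), the layer sizes, the learning rate, and every function name. Inputs are assumed to be scaled to [0, 1], and random data stands in for real expression values. The supervised fine-tuning and classification stage is omitted.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def kl_sparsity(rho, rho_hat):
    """KL divergence between a target activation rho and mean activations rho_hat."""
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))


def train_sparse_autoencoder(X, n_hidden, rho=0.05, beta=1.0, lr=0.01,
                             epochs=100, seed=0):
    """Train one sparse auto-encoder layer with full-batch gradient descent.

    Hypothetical helper: returns the encoder parameters (W1, b1) and the
    loss recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    history = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)        # hidden code
        X_hat = sigmoid(H @ W2 + b2)    # reconstruction of the input
        rho_hat = np.clip(H.mean(axis=0), 1e-6, 1 - 1e-6)
        recon = 0.5 * np.mean(np.sum((X_hat - X) ** 2, axis=1))
        history.append(recon + beta * kl_sparsity(rho, rho_hat))
        # Backpropagate the reconstruction error and the sparsity penalty.
        d_out = (X_hat - X) * X_hat * (1 - X_hat) / n
        d_sparse = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat)) / n
        d_hid = (d_out @ W2.T + d_sparse) * H * (1 - H)
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return (W1, b1), history


def pretrain_stack(X, layer_sizes, **kwargs):
    """Greedy layer-wise pre-training: each layer learns from the previous layer's codes."""
    encoders, inputs = [], X
    for size in layer_sizes:
        (W, b), _ = train_sparse_autoencoder(inputs, size, **kwargs)
        encoders.append((W, b))
        inputs = sigmoid(inputs @ W + b)
    return encoders, inputs


# Synthetic stand-in for scaled expression data: 60 samples, 200 genes.
rng = np.random.default_rng(42)
X = rng.random((60, 200))
encoders, codes = pretrain_stack(X, layer_sizes=[32, 8], epochs=100)
print(codes.shape)  # (60, 8)
```

In a semi-supervised setting, pre-training of this kind needs no labels, so it can use every available sample. The resulting low-dimensional codes would then feed a classifier trained and fine-tuned on the labeled subset.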

Original language: English
Pages (from-to): 99-105
Number of pages: 7
Journal: Computer Methods and Programs in Biomedicine
Volume: 166
State: Published - Nov 2018

Keywords

  • Cancer prediction
  • Deep learning
  • Gene expression data
  • Semi-supervised learning
  • Stacked sparse auto-encoder
