Identification of Genes Involved in Breast Cancer Metastasis by Integrating Protein-Protein Interaction Information with Expression Data

  • Xin Tian
  • Mingyuan Xin
  • Jian Luo
  • Mingyao Liu*
  • Zhenran Jiang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Selecting genes relevant to breast cancer metastasis is critical for the treatment and prognosis of cancer patients. Although much effort has been devoted to gene selection using various statistical and computational methods, interpretation of the variables in the resulting survival models has so far been limited. This article proposes a new Random Forest (RF)-based algorithm, called PPIRF, to identify important variables closely related to breast cancer metastasis. It is based on the importance scores of two variable selection algorithms: the mean decrease Gini (MDG) criterion of Random Forest and the GeneRank algorithm with protein-protein interaction (PPI) information. The improved prediction accuracy illustrates the reliability and high interpretability of the gene list selected by the PPIRF approach.
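
The abstract does not say how the two importance scores are merged, so the following is only a minimal sketch of the general idea, not the authors' PPIRF implementation. It computes GeneRank-style scores over a PPI network by fixed-point iteration and then blends them with MDG scores that are assumed to come from an already trained Random Forest. The function names, the damping factor `d`, the weight `alpha`, the min-max scaling step, and all toy values are assumptions made for illustration.

```python
def generank(expression, adjacency, d=0.5, tol=1e-10, max_iter=1000):
    """GeneRank-style scores: r = (1 - d) * ex + d * W * D^-1 * r.

    expression: non-negative per-gene evidence (e.g. absolute fold change).
    adjacency:  symmetric 0/1 matrix (list of lists) of PPI links.
    d:          damping factor; d = 0 ignores the network entirely.
    """
    n = len(expression)
    total = sum(expression)
    ex = [e / total for e in expression]          # normalise to sum to 1
    degree = [sum(row) for row in adjacency]
    r = ex[:]
    for _ in range(max_iter):
        new_r = []
        for i in range(n):
            # each neighbour j passes on its score divided by its degree
            spread = sum(adjacency[i][j] * r[j] / degree[j]
                         for j in range(n) if degree[j] > 0)
            new_r.append((1 - d) * ex[i] + d * spread)
        if max(abs(a - b) for a, b in zip(new_r, r)) < tol:
            return new_r
        r = new_r
    return r


def min_max(scores):
    """Scale scores to [0, 1]; constant input maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combine_scores(mdg, gr, alpha=0.5):
    """Hypothetical merge rule: weighted sum of min-max scaled scores."""
    return [alpha * a + (1 - alpha) * b
            for a, b in zip(min_max(mdg), min_max(gr))]


# Toy inputs, invented for illustration only (not data from the article).
genes = ["A", "B", "C", "D"]
expression = [2.0, 1.0, 0.5, 0.5]
adjacency = [            # B interacts with A, C and D
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
mdg = [0.10, 0.30, 0.05, 0.20]   # assumed output of a trained Random Forest

gr = generank(expression, adjacency)
combined = combine_scores(mdg, gr)
ranked = sorted(zip(genes, combined), key=lambda pair: pair[1], reverse=True)
print(ranked)
```

In this sketch a gene can rank highly either because the Random Forest finds it predictive or because it is well connected to differentially expressed neighbours in the PPI network; `alpha` controls the balance between the two sources of evidence.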

Original language: English
Pages (from-to): 172-182
Number of pages: 11
Journal: Journal of Computational Biology
Volume: 24
Issue number: 2
State: Published - Feb 2017

Keywords

  • GeneRank algorithm
  • PPI information
  • breast cancer metastasis
