A machine learning-based method for protein global model quality assessment

  • Qiwen Dong*
  • Yufei Chen
  • Shuigeng Zhou
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Model quality assessment is an important task in protein structure prediction and has recently been identified as one of the bottlenecks limiting the quality and usefulness of protein structure prediction methods. A new prediction category that evaluates the quality of protein models has been part of CASP since CASP7. In this study, a machine learning-based method for protein global model quality assessment (GMQA) is presented; the web server can be accessed at http://www.iipl.fudan.edu.cn/gmqa/index.html. The GMQA method takes a protein model as input and outputs a predicted MaxSub score that indicates the absolute quality of the model. The method extracts structural features from the 3D coordinates and assigns an absolute quality score to the model by support vector regression. Three types of features are extracted: secondary structure, relative solvent accessibility, and contacts. A closed test is performed on the CASP7 data set using cross-validation, and an open test is performed on the LKF data set. In both tests, good correlations between the predicted and true scores are observed. Furthermore, the method is able to discriminate native or near-native structures from a set of decoys. We also demonstrate that the GMQA method outperforms two existing methods, ProQ and Victor/FRST. Our results show that GMQA is a useful tool for model quality assessment and ranking.
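The two evaluation criteria mentioned in the abstract, correlation between predicted and true scores and the ability to pick out the best structure among decoys, can be illustrated with a minimal sketch. This is not the authors' code: the scores below are made up, the function names `pearson` and `top_ranked` are hypothetical, and the choice of the Pearson coefficient is an assumption, since the abstract does not say which correlation measure was used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def top_ranked(predicted):
    """Index of the model with the highest predicted quality score."""
    return max(range(len(predicted)), key=lambda i: predicted[i])


# Made-up quality scores for five hypothetical decoys of one target
# (illustrative values only, not taken from the paper).
true_scores = [0.82, 0.35, 0.60, 0.15, 0.71]
predicted_scores = [0.78, 0.40, 0.55, 0.22, 0.69]

r = pearson(predicted_scores, true_scores)
best = top_ranked(predicted_scores)
print(f"Pearson r = {r:.3f}")
print(f"Top-ranked decoy: index {best}, true score {true_scores[best]}")
```

A high correlation indicates that the predicted scores track the true quality across all decoys, while the top-ranked check asks the narrower question of whether the single best-scoring model is actually a good one.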

Original language: English
Pages (from-to): 417-425
Number of pages: 9
Journal: International Journal of General Systems
Volume: 40
Issue number: 4
State: Published - May 2011
Externally published: Yes

Keywords

  • model quality assessment
  • protein structure prediction
  • support vector regression
