TY - JOUR
T1 - Accurate prediction of RNA-binding protein residues with two discriminative structural descriptors
AU - Sun, Meijian
AU - Wang, Xia
AU - Zou, Chuanxin
AU - He, Zenghui
AU - Liu, Wei
AU - Li, Honglin
N1 - Publisher Copyright: © 2016 The Author(s).
PY - 2016/6/7
Y1 - 2016/6/7
N2 - Background: RNA-binding proteins participate in many important biological processes involving RNA-mediated gene regulation, and several computational methods have recently been developed to predict the protein-RNA interactions of RNA-binding proteins. Newly developed discriminative descriptors will help improve the accuracy of these methods and provide further meaningful information for researchers. Results: In this work, we designed two structural features (residue electrostatic surface potential and triplet interface propensity). Statistical and structural analysis of protein-RNA complexes showed that both features were powerful for identifying RNA-binding protein residues. Using these two features together with other effective structure- and sequence-based features, we constructed a random forest classifier to predict RNA-binding residues. In five-fold cross-validation on the training set RBP195, the area under the receiver operating characteristic curve (AUC) of our method was 0.900; on the test set RBP68, the prediction accuracy (ACC) was 0.868 and the F-score was 0.631. Conclusions: The good prediction performance of our method indicates that the two newly designed descriptors are discriminative for inferring protein residues that interact with RNAs. To facilitate the use of our method, a web server called RNAProSite, which implements the proposed method, was constructed and is freely available at http://lilab.ecust.edu.cn/NABind.
KW - Protein-RNA interactions
KW - Random forest classifier
KW - Residue electrostatic surface potential
KW - Residue triplet interface propensity
KW - Structural analysis
UR - https://www.scopus.com/pages/publications/84974575018
U2 - 10.1186/s12859-016-1110-x
DO - 10.1186/s12859-016-1110-x
M3 - Article
C2 - 27266516
AN - SCOPUS:84974575018
SN - 1471-2105
VL - 17
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - 1
M1 - 231
ER -