
HiFun: Homology independent protein function prediction by a novel protein-language self-Attention model

  • Jun Wu
  • Haipeng Qing
  • Jian Ouyang
  • Jiajia Zhou
  • Zihao Gao
  • Christopher E. Mason
  • Zhichao Liu
  • Tieliu Shi*
  • *Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Protein function prediction based on amino acid sequence alone is an extremely challenging but important task, especially in the metagenomics/metatranscriptomics field, in which novel proteins from new microorganisms are being uncovered at an exponential rate. Many of them have extremely low homology to known proteins and cannot be annotated with homology-based or information-integrative methods. To overcome this problem, we propose a Homology Independent protein Function annotation method (HiFun) based on a unified deep-learning model that reassembles the sequence as a protein language. The robustness of HiFun was evaluated using the benchmark datasets and metrics of the CAFA3 challenge. To demonstrate the utility of HiFun, we annotated 2 212 663 unknown proteins and discovered novel motifs in the UHGP-50 catalog. We show that HiFun can extract latent function-related structural features, which enables it to annotate the functions of non-homologous proteins. HiFun can substantially improve the annotation of new proteins and expand our understanding of microorganisms' adaptation to various ecological niches. Moreover, we provide a free and accessible web service at http://www.unimd.org/HiFun that requires only protein sequences as input, offering researchers an efficient and practical platform for predicting protein functions.

Original language: English
Article number: bbad311
Journal: Briefings in Bioinformatics
Volume: 24
Issue: 5
Publication status: Published - 1 Sep 2023
