VirulentHunter: deep learning-based virulence factor predictor illuminates pathogenicity in diverse microbial contexts

  • Chen Chen
  • Yong Xu
  • Jian Ouyang
  • Xiangyi Xiong
  • Paweł P. Łabaj
  • Agnieszka Chmielarczyk
  • Anna Różańska
  • Hao Zhang
  • Keyang Liu
  • Tieliu Shi*
  • Jun Wu*

* Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

3 Scopus citations

Abstract

Virulence factors (VFs) are critical determinants of bacterial pathogenicity, but current homology-based identification methods often miss novel or divergent VFs, and many machine learning approaches neglect functional classification. Here, we present VirulentHunter, a novel deep learning framework that enables simultaneous VF identification and classification directly from protein sequences by leveraging the crucial step of fine-tuning a pretrained protein language model. We curate a comprehensive VF database by integrating diverse public resources and expanding VF category annotations. Our benchmarking results demonstrate that VirulentHunter outperforms existing methods, particularly in identifying VFs lacking detectable homologs. Additionally, strain-level analysis using VirulentHunter highlights distinct pathogenicity profiles between Mycobacterium tuberculosis and Mycobacterium avium, revealing enrichment in VFs related to adherence, effector delivery systems, and immune modulation in M. tuberculosis, compared to biofilm formation and motility in M. avium. Furthermore, metagenomic profiling of gut microbiota from inflammatory bowel disease patients reveals a depletion of VFs associated with immune homeostasis. These results underscore the versatility of VirulentHunter as a powerful tool for VF analysis across diverse applications. To facilitate broader accessibility, we provide a freely accessible web service for VF prediction (http://www.unimd.org/VirulentHunter), accommodating protein sequences, genomes, and metagenomic data.

Original language: English
Article number: bbaf271
Journal: Briefings in Bioinformatics
Volume: 26
Issue number: 3
State: Published - 1 May 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. Good health and well being

Keywords

  • deep learning
  • homology independent
  • microbial pathogenicity
  • multi-purpose tool
  • virulence factors
