Alternate Geometric and Semantic Denoising Diffusion for Protein Inverse Folding

  • Chenglin Wang
  • Yucheng Zhou
  • Zhe Wang
  • Zijie Zhai
  • Jianbing Shen
  • Kai Zhang*
*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Protein inverse folding is a fundamental problem in bioinformatics, aiming to recover the amino acid sequences from a given protein backbone structure. Despite the success of existing methods, they still have two limitations: (1) widely used topological modeling via GNNs may not effectively integrate geometric context of the entire protein 3D structure by focusing on only local residue message passing, and (2) current denoising processes primarily rely on geometric relations to update residue representations, while neglecting the semantic and functional correlations between different amino acid types. In this work, we propose an Alternate Geometric and Semantic Denoising Diffusion (AGSDD) that performs two types of denoising, i.e., geometric denoising and semantic denoising in turn, in the joint Geo-semantic residue representation space: (1) the geometric denoising module uses a geometric contextual aggregator to encode global contextual information from the entire protein structure and selectively distributes information to each residue; and (2) the semantic denoising module uses a learnable key-value dictionary of residue-types to facilitate communication between them so that learned residue features can be more accurately aligned to proper residue types. In experiments, we conduct extensive evaluations on the CATH4.2, TS50 and TS500 datasets, and observe that even without using any pre-trained protein language models, AGSDD still outperforms leading methods, achieving state-of-the-art performance and exhibiting strong generalization capabilities.
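
The alternation described in the abstract can be sketched in a few lines of plain Python. This is an illustrative toy under loud assumptions, not the AGSDD implementation: the mean-pooled context, the sigmoid gate, the dot-product attention over a 20-entry dictionary, and every function name (`geometric_step`, `semantic_step`, `alternate_denoise`, `predict_types`) are hypothetical stand-ins. The sketch has no diffusion noise schedule, no 3D coordinates, and no training.

```python
import math
import random

NUM_RESIDUE_TYPES = 20  # the 20 standard amino acids


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def geometric_step(feats):
    """Assumed stand-in: pool one global context vector over all residues,
    then let each residue gate how much of it to take in."""
    n, d = len(feats), len(feats[0])
    context = [sum(f[k] for f in feats) / n for k in range(d)]
    out = []
    for f in feats:
        gate = sigmoid(dot(f, context))  # scalar in (0, 1), per residue
        out.append([f[k] + gate * context[k] for k in range(d)])
    return out


def semantic_step(feats, keys, values):
    """Assumed stand-in: each residue attends over a key-value dictionary
    with one entry per residue type and adds the retrieved value."""
    d = len(feats[0])
    out = []
    for f in feats:
        weights = softmax([dot(f, k) / math.sqrt(d) for k in keys])
        read = [sum(w * v[k] for w, v in zip(weights, values)) for k in range(d)]
        out.append([f[k] + read[k] for k in range(d)])
    return out


def alternate_denoise(feats, keys, values, num_rounds=2):
    """Apply the geometric and semantic updates in turn."""
    for _ in range(num_rounds):
        feats = geometric_step(feats)
        feats = semantic_step(feats, keys, values)
    return feats


def predict_types(feats, keys):
    """Assign each residue the index of its best-matching dictionary key."""
    return [max(range(len(keys)), key=lambda j: dot(f, keys[j])) for f in feats]


# Toy usage with random, untrained parameters.
rng = random.Random(0)
dim, num_residues = 8, 5
feats = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_residues)]
keys = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(NUM_RESIDUE_TYPES)]
values = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(NUM_RESIDUE_TYPES)]

denoised = alternate_denoise(feats, keys, values)
types = predict_types(denoised, keys)
```

In a trained model the dictionary entries would be learned so that residue features move toward the entry of the correct amino acid type; here they are random, so `types` carries no biological meaning.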

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2025, Proceedings
Editors: Rita P. Ribeiro, Alípio M. Jorge, Bernhard Pfahringer, Nathalie Japkowicz, Pedro Larrañaga, Carlos Soares, Pedro H. Abreu, João Gama
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 350-366
Number of pages: 17
ISBN (Print): 9783032060655
State: Published - 2026
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2025 - Porto, Portugal
Duration: 15 Sep 2025 to 19 Sep 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16015 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2025
Country/Territory: Portugal
City: Porto
Period: 15/09/25 to 19/09/25

Keywords

  • Alternate Denoising
  • Diffusion Model
  • Protein Inverse Folding
