GCFA: Generative class feature fusion with agent attention for medical text classification

  • Ye Wang
  • Qingyan Wang
  • Hong Yu
  • Jiang Xie
  • Feng Hu
  • Xiaoling Wang
  • Dajiang Lei*

* Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Recent advances in Generative Artificial Intelligence (GenAI), particularly large language models (LLMs), have introduced novel paradigms for long-tailed medical text classification. This task is challenging because tail classes suffer from severe data scarcity while requiring a comprehensive understanding of domain-specific medical information. To address this, we propose a Generative Class feature Fusion with Agent attention (GCFA) model, which leverages LLM-driven data generation and information fusion to enhance feature representations and mitigate data imbalance. Specifically, we propose a generative head-tailed fusion strategy that generates tail samples by fusing semantically diverse features from both head and tail distributions, so that the generated samples retain their tail-class identity while gaining semantic diversity. We then design a prompt-based medical terminology learning method in which LLMs mine critical medical terms, especially low-frequency ones, from three public datasets to construct a medical vocabulary dictionary. This dictionary guides our Medical Agent Attention Mechanism, enabling targeted emphasis on important medical terms. Extensive experiments demonstrate that GCFA achieves state-of-the-art performance across all evaluated datasets. Our code is available at https://github.com/WQYwqy123456/GCFA-123#.

Original language: English
Article number: 103639
Journal: Information Fusion
Volume: 126
State: Published - Feb 2026

Keywords

  • Generative artificial intelligence
  • Medical agent attention
  • Medical text classification
