Historical Report Guided Bi-modal Concurrent Learning for Pathology Report Generation

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Automated pathology report generation from Whole Slide Images (WSIs) faces two key challenges: (1) lack of semantic content in visual features and (2) inherent information redundancy in WSIs. To address these issues, we propose a novel Historical Report Guided Bi-modal Concurrent Learning Framework for Pathology Report Generation (BiGen) emulating pathologists’ diagnostic reasoning, consisting of: (1) A knowledge retrieval mechanism to provide rich semantic content, which retrieves WSI-relevant knowledge from pre-built medical knowledge bank by matching high-attention patches and (2) A bi-modal concurrent learning strategy instantiated via a learnable visual token and a learnable textual token to dynamically extract key visual features and retrieved knowledge, where weight-shared layers enable cross-modal alignment between visual features and knowledge features. Our multi-modal decoder integrates both modals for comprehensive diagnostic reports generation. Experiments on the PathText (BRCA) dataset demonstrate our framework’s superiority, achieving state-of-the-art performance with 7.4% relative improvement in NLP metrics and 19.1% enhancement in classification metrics for Her-2 prediction versus existing methods. Ablation studies validate the necessity of our proposed modules, highlighting our method’s ability to provide WSI-relevant rich semantic content and suppress information redundancy in WSIs. Code is publicly available at https://github.com/DeepMed-Lab-ECNU/BiGen.

Original languageEnglish
Title of host publicationMedical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, Proceedings
EditorsJames C. Gee, Jaesung Hong, Carole H. Sudre, Polina Golland, Daniel C. Alexander, Juan Eugenio Iglesias, Archana Venkataraman, Jong Hyo Kim
PublisherSpringer Science and Business Media Deutschland GmbH
Pages343-352
Number of pages10
ISBN (Print)9783032049773
DOIs
StatePublished - 2026
Event28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - Daejeon, Korea, Republic of
Duration: 23 Sep 202527 Sep 2025

Publication series

NameLecture Notes in Computer Science
Volume15965 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
Country/TerritoryKorea, Republic of
CityDaejeon
Period23/09/2527/09/25

Keywords

  • Image Caption
  • Pathology Report Generation
  • Whole Slide Image

Fingerprint

Dive into the research topics of 'Historical Report Guided Bi-modal Concurrent Learning for Pathology Report Generation'. Together they form a unique fingerprint.

Cite this