TY - GEN
T1 - Historical Report Guided Bi-modal Concurrent Learning for Pathology Report Generation
AU - Zhang, Ling
AU - Yun, Boxiang
AU - Li, Qingli
AU - Wang, Yan
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
N2 - Automated pathology report generation from Whole Slide Images (WSIs) faces two key challenges: (1) a lack of semantic content in visual features and (2) inherent information redundancy in WSIs. To address these issues, we propose a novel Historical Report Guided Bi-modal Concurrent Learning Framework for Pathology Report Generation (BiGen) that emulates pathologists’ diagnostic reasoning, consisting of: (1) a knowledge retrieval mechanism that provides rich semantic content by retrieving WSI-relevant knowledge from a pre-built medical knowledge bank through matching high-attention patches, and (2) a bi-modal concurrent learning strategy, instantiated via a learnable visual token and a learnable textual token, to dynamically extract key visual features and retrieved knowledge, where weight-shared layers enable cross-modal alignment between visual features and knowledge features. Our multi-modal decoder integrates both modalities for comprehensive diagnostic report generation. Experiments on the PathText (BRCA) dataset demonstrate our framework’s superiority, achieving state-of-the-art performance with a 7.4% relative improvement in NLP metrics and a 19.1% enhancement in classification metrics for Her-2 prediction versus existing methods. Ablation studies validate the necessity of our proposed modules, highlighting our method’s ability to provide WSI-relevant rich semantic content and suppress information redundancy in WSIs. Code is publicly available at https://github.com/DeepMed-Lab-ECNU/BiGen.
AB - Automated pathology report generation from Whole Slide Images (WSIs) faces two key challenges: (1) a lack of semantic content in visual features and (2) inherent information redundancy in WSIs. To address these issues, we propose a novel Historical Report Guided Bi-modal Concurrent Learning Framework for Pathology Report Generation (BiGen) that emulates pathologists’ diagnostic reasoning, consisting of: (1) a knowledge retrieval mechanism that provides rich semantic content by retrieving WSI-relevant knowledge from a pre-built medical knowledge bank through matching high-attention patches, and (2) a bi-modal concurrent learning strategy, instantiated via a learnable visual token and a learnable textual token, to dynamically extract key visual features and retrieved knowledge, where weight-shared layers enable cross-modal alignment between visual features and knowledge features. Our multi-modal decoder integrates both modalities for comprehensive diagnostic report generation. Experiments on the PathText (BRCA) dataset demonstrate our framework’s superiority, achieving state-of-the-art performance with a 7.4% relative improvement in NLP metrics and a 19.1% enhancement in classification metrics for Her-2 prediction versus existing methods. Ablation studies validate the necessity of our proposed modules, highlighting our method’s ability to provide WSI-relevant rich semantic content and suppress information redundancy in WSIs. Code is publicly available at https://github.com/DeepMed-Lab-ECNU/BiGen.
KW - Image Caption
KW - Pathology Report Generation
KW - Whole Slide Image
UR - https://www.scopus.com/pages/publications/105017844965
U2 - 10.1007/978-3-032-04978-0_33
DO - 10.1007/978-3-032-04978-0_33
M3 - Conference paper
AN - SCOPUS:105017844965
SN - 9783032049773
T3 - Lecture Notes in Computer Science
SP - 343
EP - 352
BT - Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, Proceedings
A2 - Gee, James C.
A2 - Hong, Jaesung
A2 - Sudre, Carole H.
A2 - Golland, Polina
A2 - Alexander, Daniel C.
A2 - Iglesias, Juan Eugenio
A2 - Venkataraman, Archana
A2 - Kim, Jong Hyo
PB - Springer Science and Business Media Deutschland GmbH
T2 - 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
Y2 - 23 September 2025 through 27 September 2025
ER -