Multi modal architecture based on dual attention improvement to assist in glaucoma grading challenges

  • Zehao Li
  • Qiwen Dong
  • Lin Wang
  • Ye Wang*
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Diabetic glaucoma, a prevalent complication of diabetes, seriously threatens patients' visual health and is a leading cause of irreversible blindness among diabetic individuals. In clinical practice, accurate grading of diabetic glaucoma is crucial for formulating personalized treatment strategies and predicting disease progression. However, existing methods often rely on single-modality data or fail to integrate multi-modal information effectively, leading to suboptimal grading performance. To address these issues, this study presents two multi-modal disease grading networks, enhanced by the cross-attention mechanism, that grade diabetic glaucoma from OCT and color fundus images. Two novel cross-attention-based image fusion strategies are developed: one employs the multi-head cross-attention mechanism for better inter-modal information fusion, and the other combines self-attention and cross-attention mechanisms. Experimental results show that, with the proposed cross-attention-based method, the Kappa value of multi-modal scoring in the GAMMA challenge reaches 84.4% and the F1 score reaches 81%. This exceeds the performance of the champion and runner-up network models in that year's competition. In addition, multi-modal grading accuracy is 1% to 4% higher than single-modality grading. This research not only improves the accuracy of diabetic glaucoma grading but also provides valuable feature support for future multi-task learning models.
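
The abstract does not give implementation details, so the following is only a minimal NumPy sketch of generic multi-head cross-attention between two modalities, not the authors' network. The token counts, feature width, random projection weights, and every name used here (multi_head_cross_attention, fundus_tokens, oct_tokens) are assumptions made for illustration. Fundus tokens act as queries and OCT tokens supply the keys and values, so each fundus token is re-expressed as a weighted mix of OCT features.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_cross_attention(query_tokens, context_tokens,
                               w_q, w_k, w_v, w_o, num_heads):
    """Generic multi-head cross-attention (illustrative, not the paper's code).

    query_tokens:   (n_q, d) features of the modality being updated
    context_tokens: (n_c, d) features of the other modality
    w_q, w_k, w_v, w_o: (d, d) projection matrices
    """
    n_q, d = query_tokens.shape
    n_c = context_tokens.shape[0]
    assert d % num_heads == 0, "feature width must divide evenly into heads"
    d_head = d // num_heads

    # Project, then split the feature axis into heads: (heads, tokens, d_head).
    q = (query_tokens @ w_q).reshape(n_q, num_heads, d_head).transpose(1, 0, 2)
    k = (context_tokens @ w_k).reshape(n_c, num_heads, d_head).transpose(1, 0, 2)
    v = (context_tokens @ w_v).reshape(n_c, num_heads, d_head).transpose(1, 0, 2)

    # Scaled dot-product scores between every query and every context token.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, n_q, n_c)
    weights = softmax(scores, axis=-1)

    # Weighted sum of context values, then merge the heads back together.
    heads = weights @ v                                   # (heads, n_q, d_head)
    merged = heads.transpose(1, 0, 2).reshape(n_q, d)
    return merged @ w_o, weights


# Hypothetical inputs: 49 fundus tokens and 64 OCT tokens, each 32-dimensional.
rng = np.random.default_rng(0)
d, num_heads = 32, 4
fundus_tokens = rng.standard_normal((49, d))
oct_tokens = rng.standard_normal((64, d))
w_q, w_k, w_v, w_o = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))

fused_fundus, attn = multi_head_cross_attention(
    fundus_tokens, oct_tokens, w_q, w_k, w_v, w_o, num_heads
)
print(fused_fundus.shape, attn.shape)
```

Swapping the two inputs would give the OCT-queries-fundus direction. A variant that also uses self-attention would pass the same token set as both query and context. How the paper orders or combines these steps is not stated in the abstract.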

Original language: English
Title of host publication: Fourth International Conference on Electronics Technology and Artificial Intelligence, ETAI 2025
Editors: Shaohua Luo, Akash Saxena
Publisher: SPIE
ISBN (Electronic): 9781510693302
State: Published - 2025
Event: 4th International Conference on Electronics Technology and Artificial Intelligence, ETAI 2025 - Harbin, China
Duration: 21 Feb 2025 - 23 Feb 2025

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 13692
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X

Conference

Conference: 4th International Conference on Electronics Technology and Artificial Intelligence, ETAI 2025
Country/Territory: China
City: Harbin
Period: 21/02/25 - 23/02/25

Keywords

  • Cross-Attention mechanism
  • Diabetic glaucoma
  • Multi-modal disease grading
  • Self-attention mechanism
