TY - GEN
T1 - RadLAS
T2 - 33rd ACM International Conference on Multimedia, MM 2025
AU - Liu, Yihang
AU - Wen, Ying
AU - Yang, Longzhen
AU - He, Lianghua
AU - Shen, Heng Tao
N1 - Publisher Copyright:
© 2025 ACM.
PY - 2025/10/27
Y1 - 2025/10/27
N2 - Medical Foundation Models (MFMs) are revolutionizing radiography image analysis with scalable and generalized diagnostic capabilities. However, their effectiveness in real-world clinical practice is limited by insufficient interpretability. To address this limitation, we propose RadLAS, a novel MFM for interpretable Radiographic image analysis that introduces Lesion-Aware Self-supervised pre-training. Unlike conventional MFMs that rely on post-hoc explanations, RadLAS directly emulates human diagnostic reasoning by first grounding lesion evidence and then making decisions accordingly. Specifically, RadLAS introduces two self-supervised tasks: (I) Lesion-grounded Reconstruction, which learns structured anatomical representations by restoring lesion-aware image patches to their healthy counterparts, thereby facilitating pixel-level grounding of lesion evidence via input-normal contrast. (II) Lesion-discrimination Contrastive Learning, which enhances lesion-aware patterns in representations by explicitly decoupling grounded lesion evidence as clinical cues and aligning them with global semantics, thereby enabling direct lesion-oriented diagnosis while preserving global context. RadLAS demonstrates excellent performance across diverse downstream radiographic datasets, offering verifiable explanations by deriving specific diagnoses (Task II) from grounded lesion evidence (Task I), while preserving the generalized representations essential for high diagnostic accuracy. Extensive experiments demonstrate that RadLAS (i) achieves superior interpretability with highly correlated lesion prediction and localization, surpassing 11 interpretable medical models; and (ii) delivers scalable representation learning, outperforming 14 SOTA supervised and self-supervised MFMs.
KW - interpretable radiography image analysis
KW - medical foundation model
KW - self-supervised representation learning
UR - https://www.scopus.com/pages/publications/105024073321
U2 - 10.1145/3746027.3754915
DO - 10.1145/3746027.3754915
M3 - Conference contribution
AN - SCOPUS:105024073321
T3 - MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025
SP - 10847
EP - 10856
BT - MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025
PB - Association for Computing Machinery, Inc
Y2 - 27 October 2025 through 31 October 2025
ER -