TY - JOUR
T1 - Medical Image Anomaly Detection Based on Self-Supervised Learning
AU - Wang, Nan
AU - Lin, Shaohui
AU - Qi, Fulin
AU - Chen, Yulong
AU - Li, Ke
AU - Shen, Yunhang
AU - Ma, Lizhuang
N1 - Publisher Copyright:
© 2025 Institute of Computing Technology. All rights reserved.
PY - 2025/3
Y1 - 2025/3
N2 - Self-supervised learning (SSL) can capture generic knowledge about different concepts, thereby benefiting various downstream image analysis tasks. To address the underutilization of multi-modal features in self-supervised learning methods for medical images, a self-supervised learning method that exploits multi-modal complementary information, named SLeM, is proposed. The method first uniformly divides the four modalities into four blocks; these blocks are randomly combined to construct multi-modal images, each combination is assigned a distinct label, and multi-modal feature representations are learned through the resulting classification task. The learned multi-modal features are then passed through a contextual fusion block (CFB), which extracts features from tumors of various sizes. Finally, the learned representations are transferred to the downstream multi-modal medical image segmentation task via simple fine-tuning. Experiments on the public BraTS and CHAOS datasets compare the method with multiple baselines, including methods based on JiGen, Taleb, and Supervoxel. The results show that the segmentation accuracy for whole tumor, tumor core, and enhancing tumor is improved by 2.03, 3.92, and 1.75 percentage points, respectively, and the visual quality obtained by this method is also significantly better than that of the other methods.
AB - Self-supervised learning (SSL) can capture generic knowledge about different concepts, thereby benefiting various downstream image analysis tasks. To address the underutilization of multi-modal features in self-supervised learning methods for medical images, a self-supervised learning method that exploits multi-modal complementary information, named SLeM, is proposed. The method first uniformly divides the four modalities into four blocks; these blocks are randomly combined to construct multi-modal images, each combination is assigned a distinct label, and multi-modal feature representations are learned through the resulting classification task. The learned multi-modal features are then passed through a contextual fusion block (CFB), which extracts features from tumors of various sizes. Finally, the learned representations are transferred to the downstream multi-modal medical image segmentation task via simple fine-tuning. Experiments on the public BraTS and CHAOS datasets compare the method with multiple baselines, including methods based on JiGen, Taleb, and Supervoxel. The results show that the segmentation accuracy for whole tumor, tumor core, and enhancing tumor is improved by 2.03, 3.92, and 1.75 percentage points, respectively, and the visual quality obtained by this method is also significantly better than that of the other methods.
KW - feature extraction
KW - medical image segmentation
KW - multi-modal fusion
KW - multiscale convolution
KW - self-supervised learning
UR - https://www.scopus.com/pages/publications/105011694029
U2 - 10.3724/SP.J.1089.2023-00339
DO - 10.3724/SP.J.1089.2023-00339
M3 - Article
AN - SCOPUS:105011694029
SN - 1003-9775
VL - 37
SP - 474
EP - 483
JO - Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics
JF - Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics
IS - 3
ER -