TY - GEN
T1 - ScaleNet
T2 - 40th Computer Graphics International Conference, CGI 2023
AU - Feng, Yu
AU - Ma, Tai
AU - Zeng, Hao
AU - Xu, Zhengke
AU - Zhang, Suwei
AU - Wen, Ying
N1 - Publisher Copyright: © 2024, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2024
Y1 - 2024
N2 - Recently, vision transformers have become outstanding segmentation structures for their remarkable global modeling capability. In current transformer-based models for medical image segmentation, convolutional layers are often replaced by transformers, or transformers are added to the deepest layer of the encoder to learn the global context. However, for the extracted multi-scale feature information, most existing methods tend to ignore the multi-scale dependencies, which leads to inadequate feature learning and fails to produce rich feature representations. In this paper, we propose ScaleNet from the perspective of feature interaction at different scales, which can alleviate the aforementioned problems. Specifically, our approach consists of two multi-scale feature interaction modules: the spatial scale interaction (SSI) and the channel scale interaction (CSI). SSI uses a transformer to aggregate patches from different scale features to enhance the feature representations at the spatial scale. CSI uses a 1D convolutional layer and a fully connected layer to perform a global fusion of multi-level features at the channel scale. The combination of CSI and SSI enables ScaleNet to emphasize multi-scale dependencies and effectively resolve complex scale variations.
AB - Recently, vision transformers have become outstanding segmentation structures for their remarkable global modeling capability. In current transformer-based models for medical image segmentation, convolutional layers are often replaced by transformers, or transformers are added to the deepest layer of the encoder to learn the global context. However, for the extracted multi-scale feature information, most existing methods tend to ignore the multi-scale dependencies, which leads to inadequate feature learning and fails to produce rich feature representations. In this paper, we propose ScaleNet from the perspective of feature interaction at different scales, which can alleviate the aforementioned problems. Specifically, our approach consists of two multi-scale feature interaction modules: the spatial scale interaction (SSI) and the channel scale interaction (CSI). SSI uses a transformer to aggregate patches from different scale features to enhance the feature representations at the spatial scale. CSI uses a 1D convolutional layer and a fully connected layer to perform a global fusion of multi-level features at the channel scale. The combination of CSI and SSI enables ScaleNet to emphasize multi-scale dependencies and effectively resolve complex scale variations.
KW - medical image segmentation
KW - multi-organ and skin lesion segmentation tasks
KW - multi-scale feature interaction
KW - transformer-based method
UR - https://www.scopus.com/pages/publications/85180741943
U2 - 10.1007/978-3-031-50078-7_18
DO - 10.1007/978-3-031-50078-7_18
M3 - Conference contribution
AN - SCOPUS:85180741943
SN - 9783031500770
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 222
EP - 236
BT - Advances in Computer Graphics - 40th Computer Graphics International Conference, CGI 2023, Proceedings
A2 - Sheng, Bin
A2 - Bi, Lei
A2 - Kim, Jinman
A2 - Magnenat-Thalmann, Nadia
A2 - Thalmann, Daniel
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 28 August 2023 through 1 September 2023
ER -