TY - GEN
T1 - SU-SAM
T2 - 2025 IEEE International Conference on Multimedia and Expo, ICME 2025
AU - Song, Yiran
AU - Zhou, Qianyu
AU - Lu, Xuequan
AU - Shao, Zhiwen
AU - Ma, Lizhuang
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Segment Anything Model (SAM) excels in common vision tasks but struggles with specialized data. Recent methods fine-tune SAM using parameter-efficient techniques and task-specific designs, but they rely heavily on handcrafting and pre/post-processing, limiting the generalizability. In this paper, we propose SU-SAM, a simple and unified framework that adapts SAM efficiently without task-specific designs, improving its adaptability to underperforming scenes. SU-SAM abstracts parameter-efficient modules into basic design elements, offering four variants: series, parallel, mixed, and LoRA structures. Experiments across nine datasets and six tasks, including medical and defect segmentation, demonstrate SU-SAM's superior performance. We analyze the effectiveness of different parameter-efficient designs and present a generalized model and benchmark, highlighting SU-SAM's adaptability across diverse datasets.
KW - Adapter
KW - Foundation Models
KW - Generalizability
KW - Segment Anything Model
KW - Underperformed Scenes
UR - https://www.scopus.com/pages/publications/105022639360
U2 - 10.1109/ICME59968.2025.11209423
DO - 10.1109/ICME59968.2025.11209423
M3 - Conference contribution
AN - SCOPUS:105022639360
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2025 IEEE International Conference on Multimedia and Expo
PB - IEEE Computer Society
Y2 - 30 June 2025 through 4 July 2025
ER -