TY - JOUR
T1 - MoDA
T2 - Mixture of Domain Adapters for Parameter-efficient Generalizable Person Re-identification
AU - Wang, Yang
AU - Zhang, Yixing
AU - Ren, Xudie
AU - Deng, Yuxin
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s).
PY - 2025/5/22
Y1 - 2025/5/22
N2 - The Domain Generalizable Re-identification (DG ReID) task has attracted significant attention in recent years as a challenging task closely aligned with practical applications. Mixture-of-experts (MoE)-based methods have been studied for DG ReID to exploit the discrepancies and inherent correlations between diverse domains. However, most DG ReID methods, especially MoE-based ones, must fully fine-tune a large number of parameters, which is not always practical in real-world scenarios. To address this problem, we propose a novel MoE-based DG ReID method, named Mixture of Domain Adapters (MoDA), which utilizes multiple expert adapters and a global adapter to help MoE-based methods scale to much larger models in a more parameter-efficient way. Furthermore, we build our approach on the large-scale vision-language pre-trained model CLIP, exploiting both its visual and text encoders to learn more robust representations from multimodal information. Extensive experiments verify the effectiveness of our method and show that MoDA achieves performance competitive with state-of-the-art DG ReID methods while using far fewer tunable parameters.
AB - The Domain Generalizable Re-identification (DG ReID) task has attracted significant attention in recent years as a challenging task closely aligned with practical applications. Mixture-of-experts (MoE)-based methods have been studied for DG ReID to exploit the discrepancies and inherent correlations between diverse domains. However, most DG ReID methods, especially MoE-based ones, must fully fine-tune a large number of parameters, which is not always practical in real-world scenarios. To address this problem, we propose a novel MoE-based DG ReID method, named Mixture of Domain Adapters (MoDA), which utilizes multiple expert adapters and a global adapter to help MoE-based methods scale to much larger models in a more parameter-efficient way. Furthermore, we build our approach on the large-scale vision-language pre-trained model CLIP, exploiting both its visual and text encoders to learn more robust representations from multimodal information. Extensive experiments verify the effectiveness of our method and show that MoDA achieves performance competitive with state-of-the-art DG ReID methods while using far fewer tunable parameters.
KW - Domain Generalization
KW - Generalizable Person Re-Identification
KW - Parameter-efficient Fine-tuning
UR - https://www.scopus.com/pages/publications/105007024413
U2 - 10.1145/3712595
DO - 10.1145/3712595
M3 - Article
AN - SCOPUS:105007024413
SN - 1551-6857
VL - 21
JO - ACM Transactions on Multimedia Computing, Communications and Applications
JF - ACM Transactions on Multimedia Computing, Communications and Applications
IS - 5
M1 - 139
ER -