TY - GEN
T1 - Self-supervised Exclusive Learning for 3D Segmentation with Cross-Modal Unsupervised Domain Adaptation
AU - Zhang, Yachao
AU - Li, Miaoyu
AU - Xie, Yuan
AU - Li, Cuihua
AU - Wang, Cong
AU - Zhang, Zhizhong
AU - Qu, Yanyun
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/10/10
Y1 - 2022/10/10
N2 - 2D-3D unsupervised domain adaptation (UDA) tackles the lack of annotations in a new domain by capitalizing on the relationship between 2D and 3D data. Existing methods achieve considerable improvements by performing cross-modality alignment in a modality-agnostic way, failing to exploit modality-specific characteristics for modeling complementarity. In this paper, we present self-supervised exclusive learning for cross-modal semantic segmentation under the UDA scenario, which avoids prohibitive annotation costs. Specifically, two self-supervised tasks are designed, named "plane-to-spatial" and "discrete-to-textured". The former helps the 2D network branch improve its perception of spatial metrics, and the latter supplements structured texture information for the 3D network branch. In this way, modality-specific exclusive information can be effectively learned, and the complementarity of the multi-modality setup is strengthened, resulting in a network that is robust to different domains. With the supervision of the self-supervised tasks, we introduce a mixed domain to enhance perception of the target domain by mixing patches of source- and target-domain samples. In addition, we propose domain-category adversarial learning with category-wise discriminators, constructing category prototypes to learn domain-invariant features. We evaluate our method on various multi-modality domain adaptation settings, where our results significantly outperform both uni-modality and multi-modality state-of-the-art competitors.
AB - 2D-3D unsupervised domain adaptation (UDA) tackles the lack of annotations in a new domain by capitalizing on the relationship between 2D and 3D data. Existing methods achieve considerable improvements by performing cross-modality alignment in a modality-agnostic way, failing to exploit modality-specific characteristics for modeling complementarity. In this paper, we present self-supervised exclusive learning for cross-modal semantic segmentation under the UDA scenario, which avoids prohibitive annotation costs. Specifically, two self-supervised tasks are designed, named "plane-to-spatial" and "discrete-to-textured". The former helps the 2D network branch improve its perception of spatial metrics, and the latter supplements structured texture information for the 3D network branch. In this way, modality-specific exclusive information can be effectively learned, and the complementarity of the multi-modality setup is strengthened, resulting in a network that is robust to different domains. With the supervision of the self-supervised tasks, we introduce a mixed domain to enhance perception of the target domain by mixing patches of source- and target-domain samples. In addition, we propose domain-category adversarial learning with category-wise discriminators, constructing category prototypes to learn domain-invariant features. We evaluate our method on various multi-modality domain adaptation settings, where our results significantly outperform both uni-modality and multi-modality state-of-the-art competitors.
KW - cross-modality
KW - mixed domain
KW - self-supervised exclusive learning
KW - semantic segmentation
KW - unsupervised domain adaptation
UR - https://www.scopus.com/pages/publications/85151139982
U2 - 10.1145/3503161.3547987
DO - 10.1145/3503161.3547987
M3 - Conference contribution
AN - SCOPUS:85151139982
T3 - MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia
SP - 3338
EP - 3346
BT - MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 30th ACM International Conference on Multimedia, MM 2022
Y2 - 10 October 2022 through 14 October 2022
ER -