TY - GEN
T1 - PLATO-TTA
T2 - 33rd ACM International Conference on Multimedia, MM 2025
AU - Xie, Jianxiang
AU - Wu, Yao
AU - Zhang, Yachao
AU - Zhang, Xiaopei
AU - Xie, Yuan
AU - Qu, Yanyun
N1 - Publisher Copyright: © 2025 ACM.
PY - 2025/10/27
Y1 - 2025/10/27
N2 - Multi-modal test-time adaptation (TTA) for 3D semantic segmentation has become a research hotspot due to its ability to address label dependency and enable rapid adaptation. Existing methods rely on extra learnable components to mitigate reliability bias; however, learning-based approaches in TTA scenarios often lack sufficient training. Moreover, most existing approaches update only normalization layers in the teacher-student framework, which limits their ability to model domain shifts. To overcome these limitations, we propose PLATO-TTA, a novel multi-modal TTA method for 3D semantic segmentation that leverages the inherent stability of robust prototypes and adaptively tunes critical teacher-student parameters. The approach contains three key components: Prototype-Guided Pseudo-Labeling (PGPL), Consistency Based Backtracking (CBB), and Domain Specific Updating (DSU). PGPL reduces reliability bias by constructing pseudo-source domain prototypes and computing modality fusion weights based on domain discrepancies. CBB updates all student model parameters while preventing catastrophic forgetting through a parameter backtracking mechanism. DSU selectively updates the teacher model using only domain-specific parameters from the student model, ensuring rapid adaptation and stable guidance. Extensive experiments demonstrate the effectiveness of PLATO-TTA, which brings a 6.3% gain in the Synthia-to-SemanticKITTI scenario with severe reliability bias and significant domain discrepancy, and achieves state-of-the-art performance across various domain adaptation scenarios.
AB - Multi-modal test-time adaptation (TTA) for 3D semantic segmentation has become a research hotspot due to its ability to address label dependency and enable rapid adaptation. Existing methods rely on extra learnable components to mitigate reliability bias; however, learning-based approaches in TTA scenarios often lack sufficient training. Moreover, most existing approaches update only normalization layers in the teacher-student framework, which limits their ability to model domain shifts. To overcome these limitations, we propose PLATO-TTA, a novel multi-modal TTA method for 3D semantic segmentation that leverages the inherent stability of robust prototypes and adaptively tunes critical teacher-student parameters. The approach contains three key components: Prototype-Guided Pseudo-Labeling (PGPL), Consistency Based Backtracking (CBB), and Domain Specific Updating (DSU). PGPL reduces reliability bias by constructing pseudo-source domain prototypes and computing modality fusion weights based on domain discrepancies. CBB updates all student model parameters while preventing catastrophic forgetting through a parameter backtracking mechanism. DSU selectively updates the teacher model using only domain-specific parameters from the student model, ensuring rapid adaptation and stable guidance. Extensive experiments demonstrate the effectiveness of PLATO-TTA, which brings a 6.3% gain in the Synthia-to-SemanticKITTI scenario with severe reliability bias and significant domain discrepancy, and achieves state-of-the-art performance across various domain adaptation scenarios.
KW - 3d semantic segmentation
KW - multi-modal learning
KW - test-time adaptation
UR - https://www.scopus.com/pages/publications/105024062058
U2 - 10.1145/3746027.3755793
DO - 10.1145/3746027.3755793
M3 - Conference contribution
AN - SCOPUS:105024062058
T3 - MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025
SP - 2226
EP - 2234
BT - MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025
PB - Association for Computing Machinery, Inc
Y2 - 27 October 2025 through 31 October 2025
ER -