Knowledge Transfer Across Modalities for Weakly Supervised Point Cloud Semantic Segmentation

  • Zihan Wang
  • Yunhang Shen
  • Mengtian Li
  • Ke Li
  • Xing Sun
  • Shaohui Lin*
  • Lizhuang Ma

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

Current weakly supervised point cloud semantic segmentation struggles to fully utilize its limited annotations in unimodal representation learning, owing to the sparse and textureless nature of point clouds. In this work, we leverage cross-modality information by transferring knowledge from image and text sources to the point cloud network. The intuition is that images contribute rich texture, color, and discriminative information that complements point clouds and boosts semantic segmentation performance. To reduce the extensive computational cost of cross-modality fusion, we introduce Multi-Scale Deformable Knowledge Transfer, a training scheme that extends the one-to-one mapping between multi-modal data to flexible one-to-many relations. Furthermore, we employ pre-trained image-text models to generate pseudo labels for point clouds and to construct positive and negative samples for semantic contrastive regularization, fully exploiting unlabeled data. Experimental results on SemanticKITTI and nuScenes demonstrate substantial improvements, with an average gain of 3.8% over previous weakly supervised methods and performance comparable to fully supervised approaches.
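To make the semantic contrastive regularization concrete, the sketch below shows one common way such a loss can be formed: points sharing a pseudo label (here, labels standing in for those produced by a pre-trained image-text model) are treated as positives, and all other points as negatives, in a supervised-contrastive (InfoNCE-style) objective. This is a minimal illustrative sketch, not the paper's exact formulation; the function name, temperature value, and NumPy implementation are assumptions.

```python
import numpy as np

def semantic_contrastive_loss(features, pseudo_labels, temperature=0.1):
    """Illustrative supervised-contrastive regularizer over point features.

    Points with the same pseudo label are positives; all others are
    negatives. A sketch of the general technique, not the authors' code.
    """
    # L2-normalize so dot products become cosine similarities
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / temperature                     # pairwise scaled similarities
    n = len(pseudo_labels)
    not_self = ~np.eye(n, dtype=bool)               # exclude anchor-to-self pairs
    # log-softmax over each row, with the anchor itself masked out
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    # positive pairs: same pseudo label, excluding self
    pos = (pseudo_labels[:, None] == pseudo_labels[None, :]) & not_self
    # mean negative log-probability of positives per anchor
    per_anchor = -(log_prob * pos).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    return per_anchor.mean()
```

With features clustered consistently with their pseudo labels, this loss is near zero; when features contradict the pseudo labels, it grows, pulling same-label points together and pushing others apart.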

Keywords

  • Knowledge Transfer
  • Multi-Modal
  • Semantic Segmentation
  • Weakly Supervised

