跳到主要导航 跳到搜索 跳到主要内容

Uni-to-Multi Modal Knowledge Distillation for Bidirectional LiDAR-Camera Semantic Segmentation

  • East China Normal University
  • Shanghai Key Laboratory Of Computer Software Evaluating And Testing
  • Central South University
  • Xiamen University

科研成果: 期刊稿件文章同行评审

摘要

Combining LiDAR points and images for robust semantic segmentation has shown great potential. However, the heterogeneity between the two modalities (e.g. the density, the field of view) poses challenges in establishing a bijective mapping between each point and pixel. This modality alignment problem introduces new challenges in network design and data processing for cross-modal methods. Specifically, 1) points that are projected outside the image planes; 2) the complexity of maintaining geometric consistency limits the deployment of many data augmentation techniques. To address these challenges, we propose a cross-modal knowledge imputation and transition approach. First, we introduce a bidirectional feature fusion strategy that imputes missing image features and performs cross-modal fusion simultaneously. This allows us to generate reliable predictions even when images are missing. Second, we propose a Uni-to-Multi modal Knowledge Distillation (U2MKD) framework, leveraging the transfer of informative features from a single-modality teacher to a cross-modality student. This overcomes the issues of augmentation misalignment and enables us to train the student effectively. Extensive experiments on the nuScenes, Waymo, and SemanticKITTI datasets demonstrate the effectiveness of our approach. Notably, our method achieves an 8.3 mIoU gain over the LiDAR-only baseline on the nuScenes validation set and achieves state-of-the-art performance on the three datasets.

源语言英语
页(从-至)11059-11072
页数14
期刊IEEE Transactions on Pattern Analysis and Machine Intelligence
46
12
DOI
出版状态已出版 - 2024

指纹

探究 'Uni-to-Multi Modal Knowledge Distillation for Bidirectional LiDAR-Camera Semantic Segmentation' 的科研主题。它们共同构成独一无二的指纹。

引用此