LFNet: Cross-Modal LiDAR-Fisheye Fusion Network for 3D Semantic Segmentation

Weijian Zhang, Zhiwei Zhang, Tianfang Sun, Zhizhong Zhang, Tan Xin, Yuan Xie*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Cross-modal fusion, which leverages images to enhance 3D semantic segmentation, has demonstrated significant effectiveness due to the complementary nature of heterogeneous data. However, existing approaches are limited to pinhole images, leaving fisheye images largely unexplored. In this paper, we introduce the LiDAR-Fisheye Fusion Network (LFNet), a dual-transformer architecture designed for cross-modal fusion (CMF) across hierarchical multi-scale layers. The 3D Transformer extracts point-level features from LiDAR data, while the pre-trained 2D Transformer extracts patch-level features from fisheye images. The CMF module comprises two key components: Local Fusion (LoF) and Global Fusion (GoF). The LoF module interpolates patch-level features to pixel level for accurate feature alignment and computes precise point-to-pixel mappings for gated fusion. Meanwhile, the GoF module enables points to capture a holistic understanding of the scene via a cross-modal attention mechanism. Experimental results highlight the potential of fisheye images as a promising modality to complement LiDAR data in 3D semantic segmentation. The code will be available at https://github.com/wjzhang642/LFNet.
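
The Local Fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the bilinear interpolation scheme, the per-channel gate parameters (w_point, w_image, bias), and the residual form of the fusion are all assumptions made for illustration, and the point-to-pixel projection itself is taken as given rather than derived from a fisheye camera model.

```python
import math


def pixel_to_patch_coords(px, py, patch_size):
    """Map an integer pixel index to continuous patch-grid coordinates.

    Assumes square, non-overlapping patches whose features sit at patch centers.
    """
    return (px + 0.5) / patch_size - 0.5, (py + 0.5) / patch_size - 0.5


def bilinear_sample(grid, u, v):
    """Bilinearly sample an H x W x C nested-list feature grid at (u, v)."""
    h, w = len(grid), len(grid[0])
    # Clamp so that border pixels reuse the nearest patch feature.
    u = min(max(u, 0.0), w - 1.0)
    v = min(max(v, 0.0), h - 1.0)
    x0, y0 = int(u), int(v)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = u - x0, v - y0
    channels = len(grid[0][0])
    return [
        (1 - ay) * ((1 - ax) * grid[y0][x0][c] + ax * grid[y0][x1][c])
        + ay * ((1 - ax) * grid[y1][x0][c] + ax * grid[y1][x1][c])
        for c in range(channels)
    ]


def gated_fusion(point_feat, image_feat, w_point, w_image, bias):
    """Per-channel gated fusion (hypothetical form): out = p + sigmoid(.) * i."""
    fused = []
    for p, i, wp, wi, b in zip(point_feat, image_feat, w_point, w_image, bias):
        gate = 1.0 / (1.0 + math.exp(-(wp * p + wi * i + b)))
        fused.append(p + gate * i)
    return fused


# Toy example: a 2 x 2 grid of single-channel patch features.
patch_grid = [[[0.0], [1.0]], [[2.0], [3.0]]]
# Pixel (15, 15) is assumed to be where one LiDAR point projects.
u, v = pixel_to_patch_coords(15, 15, patch_size=16)
image_feat = bilinear_sample(patch_grid, u, v)
fused = gated_fusion([1.0], image_feat, w_point=[0.0], w_image=[0.0], bias=[0.0])
```

With all gate parameters set to zero, the gate equals 0.5 for every channel, so the fused feature is the point feature plus half of the sampled image feature. In a trained network the gate parameters would be learned. The Global Fusion component is not covered by this sketch.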

Original language: English
Title of host publication: 2025 IEEE International Conference on Multimedia and Expo
Subtitle of host publication: Journey to the Center of Machine Imagination, ICME 2025 - Conference Proceedings
Publisher: IEEE Computer Society
ISBN (Electronic): 9798331594954
State: Published - 2025
Event: 2025 IEEE International Conference on Multimedia and Expo, ICME 2025 - Nantes, France
Duration: 30 Jun 2025 to 4 Jul 2025

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2025 IEEE International Conference on Multimedia and Expo, ICME 2025
Country/Territory: France
City: Nantes
Period: 30/06/25 to 4/07/25

Keywords

  • Fisheye Image
  • LiDAR
  • Point Cloud
  • Semantic Segmentation
  • Transformer
