BEVSOC: Self-Supervised Contrastive Learning for Calibration-Free BEV 3-D Object Detection

  • Yongqing Chen
  • Nanyu Li
  • Dandan Zhu
  • Charles C. Zhou
  • Zhuhua Hu*
  • Yong Bai*
  • Jun Yan*

* Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

26 Scopus citations

Abstract

3-D object detection based on multiview cameras and bird's-eye view (BEV) representation is a key task for autonomous driving, as it enables the perception systems to understand the surrounding scenes. However, most existing BEV representation methods rely on the projection matrix of camera intrinsic and extrinsic parameters, which requires a complex and time-consuming calibration process that may introduce errors and degrade the detection performance. Moreover, the calibration results may vary due to environmental changes and affect the stability of the detection system. To address this problem, we propose a calibration-free 3-D object detection method that leverages a group-equivariant convolutional network to extract features from multiview images and a projection network module to learn the implicit 3D-to-2D projection relationship for obtaining BEV representation. Furthermore, we employ contrastive learning (CL) to pretrain the projection network module without using manually annotated data. By exploiting the multiview camera data through CL, our proposed method eliminates the need for tedious calibration, avoids calibration errors, and reduces the dependence on a large amount of annotated data for calibration-free 3-D object detection. We evaluate our method on the nuScenes data set and demonstrate its competitive performance. Our method improves the stability and reliability of 3-D object detection in long-term autonomous driving.

Original language: English
Pages (from-to): 22167-22182
Number of pages: 16
Journal: IEEE Internet of Things Journal
Volume: 11
Issue number: 12
State: Published - 15 Jun 2024

Keywords

  • 3-D object detection
  • calibration free
  • contrastive learning (CL)
  • group equivariant convolution
  • self-supervised
