A scalable monocular 3D detector with Superpixel Feature Pyramid Network

Dongliang Ma, Fang Zhao, Ye Li, Xin Qu, Xin Jiang, Hao Wu, Xi Chen, Min Liu

Research output: Contribution to journal › Article › peer-review

Abstract

Monocular 3D object detection plays a pivotal role in vehicle perception systems. Current methods often struggle to extract scene-level semantic information effectively, and few monocular 3D detectors are tailored to embedded devices with differing computing power. This paper introduces MonoYolo, a scalable detector designed for practicality and efficiency under varying resource constraints. In particular, we design a Superpixel Feature Pyramid Network (SFPN) that automatically groups pixels with similar attributes. Experimental results on the KITTI and nuScenes datasets show that the large MonoYolo models perform favorably against leading monocular detectors, while the lightweight model maintains real-time detection. Moreover, the proposed SFPN integrates seamlessly into existing image-only 3D detectors, offering a plug-and-play way to improve monocular 3D object detection performance.
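The abstract does not say how SFPN groups pixels, so the following is only a generic illustration of the idea of grouping pixels with similar attributes. It is a SLIC-style k-means over feature and position vectors, not the authors' method. The function name `group_pixels` and the parameters `grid`, `iters`, and `pos_weight` are hypothetical and chosen for this sketch.

```python
import numpy as np

def group_pixels(features, grid=2, iters=5, pos_weight=0.5):
    """Illustrative SLIC-style grouping (an assumption, not the paper's SFPN).

    features: (H, W, C) float array.
    Returns labels (H, W) and pooled (K, C), the mean feature per group,
    where K = grid * grid.
    """
    h, w, c = features.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Normalised coordinates, scaled by pos_weight, keep groups spatially compact.
    pos = np.stack([ys / max(h - 1, 1), xs / max(w - 1, 1)], axis=-1) * pos_weight
    points = np.concatenate([features, pos], axis=-1).reshape(-1, c + 2)

    # Initialise group centres on a regular grid of pixels.
    cy = ((np.arange(grid) + 0.5) * h / grid).astype(int)
    cx = ((np.arange(grid) + 0.5) * w / grid).astype(int)
    centres = points.reshape(h, w, -1)[cy[:, None], cx[None, :]].reshape(-1, c + 2)

    for _ in range(iters):
        # Squared distance from every pixel to every centre: shape (H*W, K).
        d = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # Move each non-empty group's centre to the mean of its members.
        for k in range(len(centres)):
            mask = labels == k
            if mask.any():
                centres[k] = points[mask].mean(0)

    # The first c columns of each centre are the pooled (mean) features.
    return labels.reshape(h, w), centres[:, :c]

# Toy input: left half has feature 0, right half has feature 1.
img = np.zeros((4, 4, 1))
img[:, 2:, 0] = 1.0
labels, pooled = group_pixels(img, grid=2)
```

On this toy input, pixels from the two halves never share a group, because the feature difference outweighs the weighted position term. A learned module would typically replace the hard `argmin` with a differentiable soft assignment.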

Original language: English
Article number: 113389
Journal: Applied Soft Computing
Volume: 180
State: Published - Aug 2025

Keywords

  • Monocular 3D object detection
  • Scalable detector
  • Superpixel
  • Vehicle perception system
