A Multimodal Trustworthy Joint Perception Prediction Model for Autonomous Driving

  • Yixiao Liu
  • Lei Zhang*
  • Qian Xu
  • Yan Sun

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The emergence of vision-focused joint perception and prediction (PnP) marks a novel trend in autonomous driving research: the future states of traffic participants in the surrounding environment are predicted directly from perceptual data. However, perception from a single vehicle is insufficient to obtain accurate environmental information; the fusion of perception data from different sources and modalities therefore becomes increasingly crucial for processing and predicting environmental data. To this end, this paper proposes a novel multimodal trustworthy fusion and prediction model. First, we introduce a pose-synchronized Bird's-Eye View (BEV) encoder based on multimodal data. This encoder projects raw image inputs from cameras of any modality, captured at any pose and time, into a shared, synchronized BEV space, thereby enhancing spatiotemporal synchronization. Second, we present a Trustworthy Spatial-Temporal Pyramid Transform (TSTPT), designed to extract multiscale features from the BEV representation and to forecast future BEV states by leveraging spatial priors. Comprehensive experiments on the KITTI and nuScenes datasets demonstrate that the proposed model is feasible and, overall, more reliable and safer than existing vision-based prediction methods.
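The pose-synchronized BEV projection described in the abstract can be illustrated with a minimal sketch. The paper's actual encoder is learned; the snippet below only shows the underlying geometric idea, assuming a pinhole camera model with known intrinsics `K` and a world-to-camera pose `(R, t)`, and a ground-plane BEV grid. All function and parameter names are hypothetical.

```python
import numpy as np

def camera_to_bev(feat, K, R, t, bev_extent=(-10.0, 10.0, 0.0, 20.0),
                  bev_res=0.5, ground_z=0.0):
    """Scatter per-camera features onto a shared ground-plane BEV grid.

    feat : (H, W, C) camera feature map (hypothetical encoder output).
    K    : (3, 3) camera intrinsics.
    R, t : world-to-camera rotation (3, 3) and translation (3,). Expressing
           every camera, whatever its pose, in one shared world frame is
           the "pose synchronization" step.
    """
    x_min, x_max, y_min, y_max = bev_extent
    xs = np.arange(x_min, x_max, bev_res)
    ys = np.arange(y_min, y_max, bev_res)
    gx, gy = np.meshgrid(xs, ys, indexing="ij")           # BEV cell centres
    pts_world = np.stack([gx, gy, np.full_like(gx, ground_z)], axis=-1)

    # World -> camera -> pixel coordinates (pinhole projection).
    pts_cam = pts_world @ R.T + t
    depth = pts_cam[..., 2]
    pix = pts_cam @ K.T
    u = pix[..., 0] / np.clip(depth, 1e-6, None)
    v = pix[..., 1] / np.clip(depth, 1e-6, None)

    H, W, C = feat.shape
    valid = (depth > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)

    bev = np.zeros(gx.shape + (C,), dtype=feat.dtype)
    ui, vi = u[valid].astype(int), v[valid].astype(int)
    bev[valid] = feat[vi, ui]                             # nearest-neighbour sample
    return bev, valid
```

Features from several cameras can be projected into the same grid and fused cell-wise (e.g. by averaging), which is roughly the role the shared BEV space plays in the model.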

Original language: English
Title of host publication: Proceedings of 2024 8th Asian Conference on Artificial Intelligence Technology, ACAIT 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 133-138
Number of pages: 6
ISBN (Electronic): 9798331517090
DOIs
State: Published - 2024
Externally published: Yes
Event: 8th Asian Conference on Artificial Intelligence Technology, ACAIT 2024 - Fuzhou, China
Duration: 8 Nov 2024 – 10 Nov 2024

Publication series

Name: Proceedings of 2024 8th Asian Conference on Artificial Intelligence Technology, ACAIT 2024

Conference

Conference: 8th Asian Conference on Artificial Intelligence Technology, ACAIT 2024
Country/Territory: China
City: Fuzhou
Period: 8/11/24 – 10/11/24

Keywords

  • connected and automated vehicles
  • machine-learning methods
  • multimodal fusion
  • trust management
