UGNet: Uncertainty aware geometry enhanced networks for stereo matching

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Stereo matching is a fundamental research area in the field of computer vision. In recent years, iterative methods based on Gated Recurrent Units (GRUs) have showcased remarkable achievements in this domain. Despite their high accuracy, these methods suffer from notable limitations such as a reliance on a large number of iterations and a tendency to lose high-frequency details. To address these issues, we propose a novel uncertainty-aware framework that combines 3D convolution and GRU-based iterations, aiming to improve efficiency and accuracy. Specifically, we first introduce a probabilistic method to jointly train the disparity map and its corresponding uncertainty map using 3D convolutions. Next, leveraging the uncertainty map as a guide, we develop a novel uncertainty reweighting iterative module to assist in identifying errors in the coarse disparity and cost volume, thereby refining the disparity estimation process and significantly improving the iteration efficiency. Moreover, we introduce a high-resolution refinement module that utilizes Pixel Difference Convolution (PDC) to incorporate additional gradient information. This module fine-tunes the disparity estimate to enhance accuracy. Finally, our network is evaluated on multiple widely used benchmark datasets. The results demonstrate its proficiency in predicting precise boundaries and in reducing the number of iterations required. Our model achieves performance comparable to other state-of-the-art methods, ranking 1st on KITTI 2015 and 2nd on KITTI 2012. These results validate its strong performance and generalizability.
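
The abstract does not spell out the probabilistic formulation, so the following is only a minimal sketch of one common way to obtain a disparity map together with an uncertainty map from a cost volume, and not the authors' method. It assumes soft-argmin regression over a `(D, H, W)` cost volume, uses the spread of the per-pixel disparity distribution as an uncertainty proxy, and uses a Laplace negative log-likelihood as a joint training loss. The function names `soft_argmin_disparity` and `laplace_nll` are hypothetical.

```python
import numpy as np


def soft_argmin_disparity(cost_volume):
    """Regress disparity and an uncertainty proxy from a (D, H, W) cost volume.

    Assumption: lower cost means a better match at that disparity.
    """
    # Softmax over the disparity axis of the negated cost gives a
    # per-pixel probability distribution over candidate disparities.
    logits = -np.asarray(cost_volume, dtype=np.float64)
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)

    # Candidate disparities 0 .. D-1, shaped to broadcast over (H, W).
    candidates = np.arange(prob.shape[0], dtype=np.float64).reshape(-1, 1, 1)

    # Expected disparity under the distribution (soft-argmin).
    disparity = (prob * candidates).sum(axis=0)

    # Standard deviation of the distribution: small for a sharp, unimodal
    # cost curve and large for a flat or ambiguous one.
    variance = (prob * (candidates - disparity) ** 2).sum(axis=0)
    return disparity, np.sqrt(variance)


def laplace_nll(pred, log_scale, target):
    """Laplace negative log-likelihood, up to an additive constant.

    Assumption: the network predicts log_scale = log(b) per pixel, where b
    is the Laplace scale. Large errors are tolerated where b is large, but
    the + log_scale term penalises claiming high uncertainty everywhere.
    """
    pred, log_scale, target = map(np.asarray, (pred, log_scale, target))
    return np.abs(pred - target) * np.exp(-log_scale) + log_scale
```

In a sketch like this, a pixel with a single sharp cost minimum yields a standard deviation near zero, while a textureless region with a flat cost curve yields a large one. A map of this kind is the sort of signal that could guide a later refinement stage toward unreliable pixels.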

Original language: English
Article number: 110410
Journal: Pattern Recognition
Volume: 151
State: Published - Jul 2024

Keywords

  • Disparity regression
  • Geometry-enhanced
  • Stereo matching
  • Uncertainty guidance
