AdaptMVSNet: Efficient Multi-View Stereo with adaptive convolution and attention fusion

Pengfei Jiang, Xiaoyan Yang, Yuanjie Chen, Wenjie Song, Yang Li

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Multi-View Stereo (MVS) is a crucial technique for reconstructing the geometric structure of a scene given known camera parameters. Previous deep learning-based MVS methods have focused mainly on improving reconstruction quality while overlooking running efficiency during actual deployment. For example, deformable convolutions have been introduced to further improve reconstruction accuracy; however, their unsuitability for parallel optimization results in low inference speed. In this paper, we propose AdaptMVSNet, which is device-friendly and reconstruction-efficient while preserving the original results. To this end, adaptive convolution is introduced, yielding significant improvements in both speed and reconstruction metrics over current methods. In addition, an attention fusion module is proposed to blend features from the adaptive convolution and the feature pyramid network. Our experiments demonstrate that our approach achieves state-of-the-art performance and is almost 2× faster than the fastest recent MVS method. We will release our source code.
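The abstract describes an attention fusion module that blends features from the adaptive-convolution branch with features from the feature pyramid network. The paper's actual design is not given here, but a minimal sketch of such a fusion, assuming a learned channel-wise gating between the two branches (the `AttentionFusion` name and its structure are illustrative assumptions, not the authors' implementation), could look like:

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Hypothetical sketch: blend two same-shaped feature maps with
    learned per-channel attention weights. Not the paper's module."""

    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),            # global context per channel
            nn.Conv2d(2 * channels, channels, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),                       # blend weight in [0, 1]
        )

    def forward(self, feat_adapt, feat_fpn):
        # Weight computed from both branches, then used to interpolate.
        w = self.gate(torch.cat([feat_adapt, feat_fpn], dim=1))
        return w * feat_adapt + (1 - w) * feat_fpn

fusion = AttentionFusion(channels=32)
a = torch.randn(2, 32, 64, 64)   # adaptive-convolution features (assumed shape)
b = torch.randn(2, 32, 64, 64)   # feature pyramid network features
out = fusion(a, b)
print(out.shape)  # torch.Size([2, 32, 64, 64])
```

The convex combination `w * a + (1 - w) * b` keeps the fused features in the range spanned by the two branches, which is one common choice for such gating; the paper may use a different formulation.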

Original language: English
Pages (from-to): 128-138
Number of pages: 11
Journal: Computers and Graphics
Volume: 116
DOIs
State: Published - Nov 2023

Keywords

  • 3D reconstruction
  • Computer vision
  • Deep learning
  • Multi-view stereo
