Deep Video-Based Performance Synthesis from Sparse Multi-View Capture

  • Mingjia Chen
  • Changbo Wang*
  • Ligang Liu
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

We present a deep-learning-based technique that synthesizes novel-view videos of human performances from sparse multi-view captures. While performance capture from a sparse set of videos has received significant attention, there has been relatively little progress on non-rigid objects (e.g., human bodies). The rich articulation modes of the human body make it challenging to synthesize and interpolate the model well. To address this problem, we propose a novel deep-learning-based framework that directly predicts novel-view videos of human performances without explicit 3D reconstruction. Our method is composed of two steps: novel-view prediction and detail enhancement. We first learn a novel deep generative query network for view prediction, which synthesizes novel-view performances from a sparse set of five or fewer camera videos. We then use a new generative adversarial network to enhance the fine-scale details of the first-step results. This opens up the possibility of high-quality, low-cost video-based performance synthesis, which is gaining popularity for VR and AR applications. We demonstrate a variety of promising results in which our method synthesizes more robust and accurate performances than existing state-of-the-art approaches when only sparse views are available.

Original language: English
Pages (from-to): 543-554
Number of pages: 12
Journal: Computer Graphics Forum
Volume: 38
Issue number: 7
State: Published - 1 Oct 2019

Keywords

  • CCS Concepts: Computing methodologies → Computer graphics; Image-based rendering
