Spatiotemporal Inconsistency Learning for DeepFake Video Detection

Zhihao Gu, Yang Chen, Taiping Yao, Shouhong Ding, Jilin Li, Feiyue Huang, Lizhuang Ma

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

130 Scopus citations

Abstract

The rapid development of facial manipulation techniques has aroused public concerns in recent years. Following the success of deep learning, existing methods always formulate DeepFake video detection as a binary classification problem and develop frame-based and video-based solutions. However, little attention has been paid to capturing the spatial-temporal inconsistency in forged videos. To address this issue, we term this task as a Spatial-Temporal Inconsistency Learning (STIL) process and instantiate it into a novel STIL block, which consists of a Spatial Inconsistency Module (SIM), a Temporal Inconsistency Module (TIM), and an Information Supplement Module (ISM). Specifically, we present a novel temporal modeling paradigm in TIM by exploiting the temporal difference over adjacent frames along with both horizontal and vertical directions. And the ISM simultaneously utilizes the spatial information from SIM and temporal information from TIM to establish a more comprehensive spatial-temporal representation. Moreover, our STIL block is flexible and could be plugged into existing 2D CNNs. Extensive experiments and visualizations are presented to demonstrate the effectiveness of our method against the state-of-the-art competitors.

Original language: English
Title of host publication: MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 3473-3481
Number of pages: 9
ISBN (Electronic): 9781450386517
State: Published - 17 Oct 2021
Externally published: Yes
Event: 29th ACM International Conference on Multimedia, MM 2021 - Virtual, Online, China
Duration: 20 Oct 2021 to 24 Oct 2021

Publication series

Name: MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia

Conference

Conference: 29th ACM International Conference on Multimedia, MM 2021
Country/Territory: China
City: Virtual, Online
Period: 20/10/21 to 24/10/21

Keywords

  • deepfake video detection
  • spatiotemporal inconsistency modeling
  • video analysis
