RIVIE: Robust Inherent Video Information Embedding

  • Jun Jia
  • Zhongpai Gao
  • Dandan Zhu
  • Xiongkuo Min
  • Menghan Hu
  • Guangtao Zhai*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Imagine watching a movie and scanning the screen with a smartphone to get extra information about it, such as the cast, the release date, or the movie's homepage. We envision a world in which each video contains invisible information that can be delivered to us through mobile devices with cameras. This paper proposes the first deep learning-based information hiding method for videos to achieve information transmission from screens to cameras. Compared with hiding information in single images, methods for videos need to maintain visual quality in both the spatial and temporal domains. Furthermore, training video models relies on a large video dataset, which requires far more computational resources than training models for images. To reduce the computational complexity, we propose generating simulated sequences from single images on the fly. We then use the simulated data to train a spatio-temporal generator that hides information in videos while maintaining visual quality. During training, a temporal loss function based on the simulated data is used to ensure the temporal consistency of the generated videos. After embedding, we use a decoder to recover the hidden information. To simulate the real-world imaging pipeline from screens to cameras, we insert a distortion network between the generator and the decoder. The distortion network is based on differentiable 3D rendering and covers the distortions that may be introduced during camera imaging. Experimental results show that the hidden information in videos can be extracted by cameras without impacting the visual quality. Our work can be applied to many fields, such as advertisement, entertainment, and education.
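
The abstract does not say how the sequences are simulated or what form the temporal loss takes. The following is only a minimal sketch of the general idea, under assumptions that are not from the paper: motion is modeled as small integer translations of a single image, and temporal consistency is measured as the mean squared difference between consecutive embedding residuals after undoing that known motion. The names simulate_sequence and temporal_loss and all parameters are hypothetical.

```python
import numpy as np


def simulate_sequence(image, num_frames=4, max_shift=2, rng=None):
    """Build a pseudo-video from one image (H, W, C).

    Assumption (not from the paper): motion is a cumulative random
    integer translation, applied with wrap-around via np.roll.
    Returns the frames (T, H, W, C) and each frame's (dy, dx) offset
    relative to the source image.
    """
    rng = np.random.default_rng() if rng is None else rng
    offsets = [(0, 0)]
    for _ in range(num_frames - 1):
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        prev_dy, prev_dx = offsets[-1]
        offsets.append((prev_dy + int(dy), prev_dx + int(dx)))
    frames = np.stack([np.roll(image, shift=o, axis=(0, 1)) for o in offsets])
    return frames, offsets


def temporal_loss(residuals, offsets):
    """Mean squared change between consecutive residuals.

    residuals[t] is the (hypothetical) difference between the encoded
    and original frame t. The previous residual is shifted by the known
    simulated motion before comparison, so a residual that moves with
    the content gives a loss of zero.
    """
    total = 0.0
    for t in range(1, len(residuals)):
        dy = offsets[t][0] - offsets[t - 1][0]
        dx = offsets[t][1] - offsets[t - 1][1]
        warped_prev = np.roll(residuals[t - 1], shift=(dy, dx), axis=(0, 1))
        total += np.mean((residuals[t] - warped_prev) ** 2)
    return total / (len(residuals) - 1)
```

Because the motion is generated rather than estimated, the offsets are known exactly and no optical flow is needed to align consecutive frames. In this toy setting, a residual that follows the simulated motion scores zero, while a residual that changes independently from frame to frame (flicker) is penalized.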

Original language: English
Pages (from-to): 7364-7377
Number of pages: 14
Journal: IEEE Transactions on Multimedia
Volume: 25
State: Published - 2023

Keywords

  • 3D rendering
  • Data hiding
  • adversarial training
  • display-camera communication
  • temporal consistency
