
Human Attention Based Movie Summarization: Dataset and Baseline Model

  • Defang Zhao
  • Dandan Zhu*
  • Xiongkuo Min
  • Jiaomin Yue
  • Kaiwei Zhang
  • Qiangqiang Zhou
  • Guangtao Zhai
  • Xiaokang Yang

*Corresponding author for this work

  • CloudWalk Technology
  • Donghua University
  • Shanghai Jiao Tong University
  • Jiangxi Normal University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

A movie summarization model can automatically produce a condensed, succinct version of a movie by selecting its keyframes. Previous works mainly rely on hand-crafted heuristics, and most of them are unsupervised. Supervised movie summarization is a new research field, and no suitable dataset is currently publicly available. Moreover, existing works focus only on the movies themselves while neglecting the audiences, who have the most say in which parts of a movie are more attractive. To address these limitations, we establish Movie50, a human attention based movie summarization dataset. Specifically, we explore how human attention varies when watching videos and report the following findings: (1) human attention is concentrated when watching keyframes; (2) human attention is dispersed when watching non-keyframes. Inspired by these findings, we collect the eye fixations of 20 participants watching 50 movies and propose a novel human attention based annotation pipeline. In addition, we introduce A/V-MSNet, an audiovisual neural network that uses spatio-temporal visual and auditory information to better model human attention and to exploit richer information. Extensive experiments demonstrate the superiority of the proposed method.
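The two findings in the abstract suggest that the spread of viewers' fixations on a frame could serve as a keyframe signal. The following is only a minimal sketch of that idea, not the paper's annotation pipeline: the dispersion measure (mean distance to the centroid), the function names `fixation_dispersion` and `label_keyframes`, the threshold value, and the toy fixation data are all assumptions made for illustration.

```python
from statistics import mean


def fixation_dispersion(points):
    """Mean Euclidean distance of fixation points from their centroid.

    Assumed measure for illustration; the abstract does not specify one.
    """
    cx = mean(x for x, _ in points)
    cy = mean(y for _, y in points)
    return mean(((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 for x, y in points)


def label_keyframes(fixations_per_frame, threshold):
    """Flag a frame as a keyframe candidate when gaze is tightly clustered."""
    return [fixation_dispersion(p) < threshold for p in fixations_per_frame]


# Toy data: normalised (x, y) fixations from four hypothetical viewers per frame.
frames = [
    [(0.50, 0.50), (0.52, 0.49), (0.49, 0.51), (0.51, 0.50)],  # gaze clustered
    [(0.10, 0.20), (0.90, 0.80), (0.30, 0.90), (0.80, 0.10)],  # gaze scattered
]

# The threshold is a hypothetical value chosen for this toy data.
labels = label_keyframes(frames, threshold=0.1)
print(labels)  # [True, False]
```

In this toy case the first frame, where all viewers look at nearly the same point, is flagged as a keyframe candidate, while the second, where gaze is scattered, is not. A real pipeline would need a principled threshold and would likely aggregate over time rather than per frame.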

Original language: English
Title of host publication: ICME 2022 - IEEE International Conference on Multimedia and Expo 2022, Proceedings
Publisher: IEEE Computer Society
ISBN (Electronic): 9781665485630
DOIs
State: Published - 2022
Externally published: Yes
Event: 2022 IEEE International Conference on Multimedia and Expo, ICME 2022 - Taipei, Taiwan, Province of China
Duration: 18 Jul 2022 to 22 Jul 2022

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
Volume: 2022-July
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2022 IEEE International Conference on Multimedia and Expo, ICME 2022
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 18/07/22 to 22/07/22

Keywords

  • Movie summarization
  • audiovisual
  • human attention
  • keyframes
  • multi-modal
