Dual focus attention network for video emotion recognition

Haonan Qiu, Liang He, Feng Wang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Scopus citations

Abstract

Video emotion recognition is a challenging task due to complex scenes and the various forms of emotion expression. Most existing works focus on fusing multiple features over whole video clips. According to our observations, in a long video clip the emotion is usually conveyed by only a few actions or objects in a few short snippets, and the meaningful cues are buried in a noisy background. When humans judge the emotion in a video, they first find the informative clips and then look closely for emotional cues in the frames. In this paper, we propose a Dual Focus Attention Network to mimic this process. First, three kinds of features (action, object, and scene) are extracted from the videos. Second, two attention modules are used to focus on the visual features of the videos along the temporal and spatial dimensions, respectively. With our dual focus attention network, we can effectively discover the most emotional frames along the time dimension and the most emotional visual cues in each frame. Experiments conducted on two widely used datasets, Ekman and VideoEmotion, show that the proposed approach outperforms existing approaches.
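As a rough illustration of the two-stage focus described in the abstract, the sketch below first pools region features within each frame using spatial attention weights, then pools the resulting frame vectors using temporal attention weights. It is only a minimal sketch under stated assumptions, not the authors' architecture: the function `dual_focus_pool`, the linear scoring vectors `w_spatial` and `w_temporal`, the feature layout (frames × regions × dimensions), and the random inputs are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Return sum_i weights[i] * vectors[i] as a single vector."""
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]


def dual_focus_pool(features, w_spatial, w_temporal):
    """Hypothetical dual attention pooling.

    features:   list of T frames, each a list of R region vectors of size D
    w_spatial:  assumed scoring vector (size D) for regions within a frame
    w_temporal: assumed scoring vector (size D) for frames within a clip
    """
    frame_vectors, spatial_weights = [], []
    for regions in features:
        # Spatial focus: weight the regions inside this frame.
        weights = softmax([dot(r, w_spatial) for r in regions])
        spatial_weights.append(weights)
        frame_vectors.append(weighted_sum(weights, regions))

    # Temporal focus: weight the frames across the clip.
    temporal_weights = softmax([dot(f, w_temporal) for f in frame_vectors])
    clip_vector = weighted_sum(temporal_weights, frame_vectors)
    return clip_vector, temporal_weights, spatial_weights


# Toy input (random placeholders, not real video features).
random.seed(0)
T, R, D = 8, 49, 16
features = [[[random.gauss(0, 1) for _ in range(D)] for _ in range(R)] for _ in range(T)]
w_spatial = [random.gauss(0, 1) for _ in range(D)]
w_temporal = [random.gauss(0, 1) for _ in range(D)]

clip_vector, temporal_weights, spatial_weights = dual_focus_pool(
    features, w_spatial, w_temporal
)
```

In this sketch, the largest entries of `temporal_weights` mark the frames that contribute most to the clip representation, and each row of `spatial_weights` marks the regions that contribute most within a frame. A trained model would learn the scoring parameters rather than draw them at random.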

Original language: English
Title of host publication: 2020 IEEE International Conference on Multimedia and Expo, ICME 2020
Publisher: IEEE Computer Society
ISBN (Electronic): 9781728113319
State: Published - Jul 2020
Event: 2020 IEEE International Conference on Multimedia and Expo, ICME 2020 - London, United Kingdom
Duration: 6 Jul 2020 to 10 Jul 2020

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
Volume: 2020-July
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2020 IEEE International Conference on Multimedia and Expo, ICME 2020
Country/Territory: United Kingdom
City: London
Period: 6/07/20 to 10/07/20

Keywords

  • Attention for video
  • Deep learning
  • Video emotion recognition
