Skip to main navigation Skip to search Skip to main content

NHFNET: A Non-Homogeneous Fusion Network for Multimodal Sentiment Analysis

  • Ziwang Fu
  • , Feng Liu
  • , Qing Xu
  • , Jiayin Qi*
  • , Xiangling Fu*
  • , Aimin Zhou
  • , Zhibin Li
  • *Corresponding author for this work
  • Beijing University of Posts and Telecommunications
  • East China Normal University

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Fusion technology is crucial for multimodal sentiment analysis. Recent attention-based fusion methods demonstrate high performance and strong robustness. However, these approaches ignore the difference in information density among the three modalities, i.e., visual and audio have low-level signal features and conversely text has high-level semantic features. To this end, we propose a non-homogeneous fusion network (NHFNet) to achieve multimodal information interaction. Specifically, a fusion module with attention aggregation is designed to handle the fusion of visual and audio modalities to enhance them to high-level semantic features. Then, cross-modal attention is used to achieve information reinforcement of text modality and audio-visual fusion. NHFNet compensates for the differences in information density of different modalities enabling their fair interaction. To verify the effectiveness of the proposed method, we set up the aligned and unaligned experiments on the CMU-MOSEI dataset, respectively. The experimental results show that the proposed method outperforms the state-of-the-art. Codes are available at https://github.com/skeletonNN/NHFNet.

Original languageEnglish
Title of host publicationICME 2022 - IEEE International Conference on Multimedia and Expo 2022, Proceedings
PublisherIEEE Computer Society
ISBN (Electronic)9781665485630
DOIs
StatePublished - 2022
Event2022 IEEE International Conference on Multimedia and Expo, ICME 2022 - Taipei, Taiwan, Province of China
Duration: 18 Jul 202222 Jul 2022

Publication series

NameProceedings - IEEE International Conference on Multimedia and Expo
Volume2022-July
ISSN (Print)1945-7871
ISSN (Electronic)1945-788X

Conference

Conference2022 IEEE International Conference on Multimedia and Expo, ICME 2022
Country/TerritoryTaiwan, Province of China
CityTaipei
Period18/07/2222/07/22

Keywords

  • Multimodal sentiment analysis
  • attention aggregation
  • cross-modal attention
  • fusion

Fingerprint

Dive into the research topics of 'NHFNET: A Non-Homogeneous Fusion Network for Multimodal Sentiment Analysis'. Together they form a unique fingerprint.

Cite this