A Sentimental Prompt Framework with Visual Text Encoder for Multimodal Sentiment Analysis

  • Shizhou Huang
  • Bo Xu
  • Changqun Li
  • Jiabo Ye
  • Xin Lin*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Multimodal sentiment analysis of social media posts has received increasing attention because it can improve on single-modality sentiment analysis by leveraging the complementary information in text and images. Despite their success, current methods still have two weaknesses: (1) the image representations they use do not capture sentiment information, which leaves a significant gap between the image representations and the final sentiment results; (2) they ignore the sentiments expressed by symbols in the text (emoticons and emojis), even though these symbols effectively reflect the user’s sentiments. To address these issues, we propose a sentimental prompt framework with visual text encoder (SPFVTE). For the first problem, instead of using the image representation directly, we project it into a prompt and use prompt learning to capture the sentiment information in images by learning a sentiment-specific prompt. For the second problem, since people infer the meanings of emojis and emoticons from their graphical form, we render the text as an image and use a visual text encoder to capture the sentiments these symbols carry. Experiments on three public multimodal sentiment datasets show that our method significantly and consistently outperforms state-of-the-art methods. The datasets and source code are available at https://github.com/JinFish/SPFVTE.
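
The idea of projecting an image representation into a prompt can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that); it only shows the general shape of the technique under stated assumptions. The function names, the dimensions, and the random weights standing in for a learned projection are all hypothetical.

```python
import random

def make_projection(in_dim, out_dim, seed=0):
    """Random weights standing in for a learned linear projection (assumption)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

def project(weights, vec):
    """Apply the linear map: one dot product per output dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def image_to_prompt(image_feat, weights, prompt_len, hidden_dim):
    """Project one image feature vector and split it into prompt_len vectors."""
    flat = project(weights, image_feat)  # length: prompt_len * hidden_dim
    return [flat[i * hidden_dim:(i + 1) * hidden_dim] for i in range(prompt_len)]

def build_input(prompt_vectors, text_embeddings):
    """Prepend the image-derived prompt vectors to the text token embeddings."""
    return prompt_vectors + text_embeddings

# Hypothetical toy sizes, chosen only to keep the example small.
IMAGE_DIM, HIDDEN_DIM, PROMPT_LEN = 8, 4, 2

weights = make_projection(IMAGE_DIM, PROMPT_LEN * HIDDEN_DIM)
image_feat = [0.5] * IMAGE_DIM                            # stand-in image representation
text_embeddings = [[0.0] * HIDDEN_DIM for _ in range(3)]  # stand-in for 3 text tokens

prompt = image_to_prompt(image_feat, weights, PROMPT_LEN, HIDDEN_DIM)
sequence = build_input(prompt, text_embeddings)
```

In a trained system the projection weights would be learned so that the prompt carries sentiment-specific information; here they are random, so the sketch shows only how the shapes fit together.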

Original language: English
Title of host publication: ICMR 2024 - Proceedings of the 14th Annual ACM International Conference on Multimedia Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 638-646
Number of pages: 9
ISBN (Electronic): 9798400706028
State: Published - 7 Jun 2024
Event: 14th Annual ACM International Conference on Multimedia Retrieval, ICMR 2024 - Phuket, Thailand
Duration: 10 Jun 2024 to 14 Jun 2024

Publication series

Name: ICMR 2024 - Proceedings of the 2024 International Conference on Multimedia Retrieval

Conference

Conference: 14th Annual ACM International Conference on Multimedia Retrieval, ICMR 2024
Country/Territory: Thailand
City: Phuket
Period: 10/06/24 to 14/06/24

Keywords

  • multimodal fusion
  • multimodal sentiment analysis
  • social media posts
