TY - JOUR
T1 - Exploring Cognitive and Aesthetic Causality for Multimodal Aspect-Based Sentiment Analysis
AU - Xiao, Luwei
AU - Mao, Rui
AU - Zhao, Shuai
AU - Lin, Qika
AU - Jia, Yanhao
AU - He, Liang
AU - Cambria, Erik
N1 - Publisher Copyright: © 2010-2012 IEEE.
PY - 2025
Y1 - 2025
AB - Multimodal aspect-based sentiment classification (MASC) is an emerging task due to an increase in user-generated multimodal content on social platforms, aimed at predicting sentiment polarity toward specific aspect targets (i.e., entities or attributes explicitly mentioned in text-image pairs). Despite extensive efforts and significant achievements in existing MASC, substantial gaps remain in understanding fine-grained visual content and the cognitive rationales derived from semantic content and impressions (cognitive interpretations of emotions evoked by image content). In this study, we present Chimera: a cognitive and aesthetic sentiment causality understanding framework to derive fine-grained holistic features of aspects and infer the fundamental drivers of sentiment expression from both semantic perspectives and affective-cognitive resonance (the synergistic effect between emotional responses and cognitive interpretations). The framework aligns visual patches with words, extracts coarse and fine-grained visual features, translates them into textual descriptions, and uses LLM-generated sentimental causes and impressions to boost sensitivity to affective cues. Experiments on MASC datasets show the model’s effectiveness and greater flexibility compared to LLMs like GPT-4o.
KW - Multimodal aspect-based sentiment classification (MASC)
KW - affective-cognitive resonance
KW - large language models
KW - sentiment causality
UR - https://www.scopus.com/pages/publications/105004076766
U2 - 10.1109/TAFFC.2025.3565506
DO - 10.1109/TAFFC.2025.3565506
M3 - Article
AN - SCOPUS:105004076766
SN - 1949-3045
VL - 16
SP - 3248
EP - 3265
JO - IEEE Transactions on Affective Computing
JF - IEEE Transactions on Affective Computing
IS - 4
ER -