TY - GEN
T1 - Low-Redundancy Knowledge Generation and Modality-Aware Interaction for Multimodal Information Extraction in Social Media
AU - Huang, Shizhou
AU - Xu, Bo
AU - Li, Changqun
AU - Yu, Yang
AU - Lin, Xin
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Multimodal information extraction (MIE) has gained increasing attention, as it improves information extraction by using images as auxiliary information. By acquiring entity-related knowledge, knowledge generation methods can effectively enhance the performance of information extraction models. However, current knowledge generation methods have two weaknesses: (1) they often generate knowledge that includes task-irrelevant information, causing redundancy and degrading model performance; (2) they typically concatenate the generated knowledge and the text input directly, ignoring the stylistic and contextual differences arising from their different sources. To address these issues, we propose Low-Redundancy Knowledge Generation and Modality-Aware Interaction (LRKG-MAI). Our approach leverages a large language model to generate task-relevant knowledge with minimal redundancy, while treating knowledge as a distinct modality that interacts with text within its own representation space. Extensive experiments demonstrate the effectiveness of our approach. The source code can be found at https://github.com/JinFish/LRKG-MAI.
KW - knowledge generation
KW - knowledge interaction
KW - multimodal information extraction
KW - social media
UR - https://www.scopus.com/pages/publications/105022643965
U2 - 10.1109/ICME59968.2025.11209770
DO - 10.1109/ICME59968.2025.11209770
M3 - Conference contribution
AN - SCOPUS:105022643965
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2025 IEEE International Conference on Multimedia and Expo
PB - IEEE Computer Society
T2 - 2025 IEEE International Conference on Multimedia and Expo, ICME 2025
Y2 - 30 June 2025 through 4 July 2025
ER -