TY - GEN
T1 - STAIR
T2 - 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025
AU - Xu, Cong
AU - He, Yunhang
AU - Wang, Jun
AU - Zhang, Wei
N1 - Publisher Copyright:
Copyright © 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2025/4/11
Y1 - 2025/4/11
N2 - While the mining of modalities is the focus of most multimodal recommendation methods, we believe that how to fully utilize both collaborative and multimodal information is pivotal in e-commerce scenarios where, as clarified in this work, the user behaviors are rarely determined entirely by multimodal features. In order to combine the two distinct types of information, some additional challenges are encountered: 1) Modality erasure: Vanilla graph convolution, which proves rather useful in collaborative filtering, however erases multimodal information; 2) Modality forgetting: Multimodal information tends to be gradually forgotten as the recommendation loss essentially facilitates the learning of collaborative information. To this end, we propose a novel approach named STAIR, which employs a novel STepwise grAph convolution to enable a co-existence of collaborative and multimodal Information in e-commerce Recommendation. Besides, it starts with the raw multimodal features as an initialization, and the forgetting problem can be significantly alleviated through constrained embedding updates. As a result, STAIR achieves state-of-the-art recommendation performance on three public e-commerce datasets with minimal computational and memory costs.
AB - While the mining of modalities is the focus of most multimodal recommendation methods, we believe that how to fully utilize both collaborative and multimodal information is pivotal in e-commerce scenarios where, as clarified in this work, the user behaviors are rarely determined entirely by multimodal features. In order to combine the two distinct types of information, some additional challenges are encountered: 1) Modality erasure: Vanilla graph convolution, which proves rather useful in collaborative filtering, however erases multimodal information; 2) Modality forgetting: Multimodal information tends to be gradually forgotten as the recommendation loss essentially facilitates the learning of collaborative information. To this end, we propose a novel approach named STAIR, which employs a novel STepwise grAph convolution to enable a co-existence of collaborative and multimodal Information in e-commerce Recommendation. Besides, it starts with the raw multimodal features as an initialization, and the forgetting problem can be significantly alleviated through constrained embedding updates. As a result, STAIR achieves state-of-the-art recommendation performance on three public e-commerce datasets with minimal computational and memory costs.
UR - https://www.scopus.com/pages/publications/105003904229
U2 - 10.1609/aaai.v39i12.33407
DO - 10.1609/aaai.v39i12.33407
M3 - 会议稿件
AN - SCOPUS:105003904229
T3 - Proceedings of the AAAI Conference on Artificial Intelligence
SP - 12899
EP - 12907
BT - Special Track on AI Alignment
A2 - Walsh, Toby
A2 - Shah, Julie
A2 - Kolter, Zico
PB - Association for the Advancement of Artificial Intelligence
Y2 - 25 February 2025 through 4 March 2025
ER -