TY - GEN
T1 - MEScan360
T2 - 2025 IEEE International Conference on Multimedia and Expo, ICME 2025
AU - Zhang, Yuchen
AU - Zhu, Dandan
AU - Zhang, Kaiwei
AU - Jiang, Fei
AU - Zhai, Guangtao
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Scanpath prediction for omnidirectional images (ODIs) aims to capture dynamic human visual attention. However, complicated gaze behavior and inevitable projection distortion make scanpath prediction in ODIs extremely challenging. Most existing models neither capture the long-term dependencies across visual states nor fully incorporate historical memory information, leading to limited performance. To this end, we propose MEScan360, a memory-enhanced scanpath prediction model for ODIs. We introduce two key innovations: a long-term memory storage unit and a memory interaction module. These two components establish a more explicit link between past visual information and current visual inputs, thereby significantly enhancing the performance of scanpath prediction. Furthermore, a robust feature extraction module is designed to extract semantic features precisely from distorted ODIs with a more lightweight structure. Extensive experiments on several benchmark datasets demonstrate that our proposed model achieves competitive performance in both accuracy and efficiency.
KW - long-term memory storage unit
KW - memory interaction module
KW - omnidirectional images
KW - robust features
KW - Scanpath prediction
UR - https://www.scopus.com/pages/publications/105022647379
U2 - 10.1109/ICME59968.2025.11210113
DO - 10.1109/ICME59968.2025.11210113
M3 - Conference contribution
AN - SCOPUS:105022647379
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2025 IEEE International Conference on Multimedia and Expo
PB - IEEE Computer Society
Y2 - 30 June 2025 through 4 July 2025
ER -