TY - GEN
T1 - A LIGHTWEIGHT SALIENCY PREDICTION MODEL FOR OMNIDIRECTIONAL IMAGES
AU - Zhu, Dandan
AU - Chen, Yongqing
AU - Zhao, Defang
AU - Min, Xiongkuo
AU - Zhou, Qiangqiang
AU - Yu, Shaobo
AU - Zhai, Guangtao
AU - Yang, Xiaokang
N1 - Publisher Copyright:
© 2021 IEEE
PY - 2021
Y1 - 2021
N2 - At present, most high-performing saliency prediction models for omnidirectional images (ODIs) depend on deeper or wider convolutional neural networks (CNNs), benefiting from their superior feature representation capability but suffering from high computational costs. To address this issue, we propose a novel lightweight saliency prediction model to predict eye fixations on ODIs. Specifically, our model consists of three modules: a lightweight feature representation module, a supervised attention module, and a dynamic convolution aggregation module. Unlike existing saliency prediction models, ours is the first to introduce dynamic convolution into saliency prediction, aggregating multiple parallel convolution kernels dynamically based on their attention. Such a dynamic convolution operation is not only computationally efficient (small kernel size) but also increases feature representation capability, since the convolution kernels are aggregated in a non-linear manner via attention. Experimental results on two benchmark datasets show that our model is lightweight and outperforms other state-of-the-art methods.
AB - At present, most high-performing saliency prediction models for omnidirectional images (ODIs) depend on deeper or wider convolutional neural networks (CNNs), benefiting from their superior feature representation capability but suffering from high computational costs. To address this issue, we propose a novel lightweight saliency prediction model to predict eye fixations on ODIs. Specifically, our model consists of three modules: a lightweight feature representation module, a supervised attention module, and a dynamic convolution aggregation module. Unlike existing saliency prediction models, ours is the first to introduce dynamic convolution into saliency prediction, aggregating multiple parallel convolution kernels dynamically based on their attention. Such a dynamic convolution operation is not only computationally efficient (small kernel size) but also increases feature representation capability, since the convolution kernels are aggregated in a non-linear manner via attention. Experimental results on two benchmark datasets show that our model is lightweight and outperforms other state-of-the-art methods.
KW - Omnidirectional images
KW - dynamic convolution network
KW - lightweight model
KW - saliency prediction
KW - supervised attention mechanism
UR - https://www.scopus.com/pages/publications/85126447329
U2 - 10.1109/ICME51207.2021.9428420
DO - 10.1109/ICME51207.2021.9428420
M3 - Conference contribution
AN - SCOPUS:85126447329
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
PB - IEEE Computer Society
T2 - 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
Y2 - 5 July 2021 through 9 July 2021
ER -