TY - JOUR
T1 - Multiscale Brain-Like Neural Network for Saliency Prediction on Omnidirectional Images
AU - Zhu, Dandan
AU - Chen, Yongqing
AU - Zhao, Defang
AU - Zhu, Yucheng
AU - Zhou, Qiangqiang
AU - Zhai, Guangtao
AU - Yang, Xiaokang
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2022/6/1
Y1 - 2022/6/1
N2 - Current top-performing saliency prediction methods for omnidirectional images (ODIs) depend on deep feedforward convolutional neural networks (CNNs), benefiting from their powerful multiscale representation ability. Although these methods adopt deep feedforward CNNs to achieve superb performance in the saliency prediction task, they have the following limitations: 1) these deep feedforward CNNs are difficult to map to the ventral stream structure of the brain's visual system due to their vast number of layers and missing biologically important connections, such as recurrence; and 2) most deep feedforward CNNs represent multiscale features in a layerwise manner. To tackle these issues, models that can learn multiscale features yet share similarities with the human brain are needed. In this article, we propose a novel multiscale brain-like network (MBN) model to predict the saliency of head fixations on ODIs. Specifically, our proposed model consists of two major modules: 1) a brain-like CORnet-S module and 2) a multiscale feature extraction module. The CORnet-S module is a lightweight backbone network with four anatomically mapped areas (V1, V2, V4, and IT), and it can simulate the visual processing mechanism of the ventral visual stream in the human brain. The multiscale feature extraction module is inspired by the multiscale brain structure; it represents multiscale features at a granular level and increases the range of receptive fields for each network layer. Extensive experiments and ablation studies conducted on two major benchmarks demonstrate the superiority of the proposed MBN model over state-of-the-art methods.
AB - Current top-performing saliency prediction methods for omnidirectional images (ODIs) depend on deep feedforward convolutional neural networks (CNNs), benefiting from their powerful multiscale representation ability. Although these methods adopt deep feedforward CNNs to achieve superb performance in the saliency prediction task, they have the following limitations: 1) these deep feedforward CNNs are difficult to map to the ventral stream structure of the brain's visual system due to their vast number of layers and missing biologically important connections, such as recurrence; and 2) most deep feedforward CNNs represent multiscale features in a layerwise manner. To tackle these issues, models that can learn multiscale features yet share similarities with the human brain are needed. In this article, we propose a novel multiscale brain-like network (MBN) model to predict the saliency of head fixations on ODIs. Specifically, our proposed model consists of two major modules: 1) a brain-like CORnet-S module and 2) a multiscale feature extraction module. The CORnet-S module is a lightweight backbone network with four anatomically mapped areas (V1, V2, V4, and IT), and it can simulate the visual processing mechanism of the ventral visual stream in the human brain. The multiscale feature extraction module is inspired by the multiscale brain structure; it represents multiscale features at a granular level and increases the range of receptive fields for each network layer. Extensive experiments and ablation studies conducted on two major benchmarks demonstrate the superiority of the proposed MBN model over state-of-the-art methods.
KW - Brain-like neural network
KW - lightweight model
KW - multiscale features
KW - omnidirectional image
KW - saliency prediction
UR - https://www.scopus.com/pages/publications/85099723095
U2 - 10.1109/TCDS.2021.3052526
DO - 10.1109/TCDS.2021.3052526
M3 - Article
AN - SCOPUS:85099723095
SN - 2379-8920
VL - 14
SP - 507
EP - 518
JO - IEEE Transactions on Cognitive and Developmental Systems
JF - IEEE Transactions on Cognitive and Developmental Systems
IS - 2
ER -