Decoupled dynamic group equivariant filter for saliency prediction on omnidirectional image

  • Dandan Zhu
  • Kaiwei Zhang
  • Guokai Zhang*
  • Qiangqiang Zhou
  • Xiongkuo Min
  • Guangtao Zhai
  • Xiaokang Yang
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Current saliency prediction models based on convolutional neural networks (CNNs) have achieved solid improvements in predicting human attention on omnidirectional images (ODIs). However, models that employ standard convolution have two main shortcomings: they are content-agnostic and computation-intensive. To address these shortcomings, we propose a decoupled dynamic group equivariant filter (DDGF). Specifically, inspired by attention mechanisms that adopt lightweight branches for estimating spatial and channel attention, we decouple group equivariant convolution (i.e., p4 convolution) into spatial and channel dynamic group equivariant filters. This design not only makes the p4 convolution filter adaptive to ODI content but also considerably reduces computational cost. To the best of our knowledge, the DDGF is the first decoupled dynamic convolution filter applied to the task of saliency prediction. Moreover, we observe that replacing standard group equivariant convolution with the DDGF is both effective and efficient for ODI saliency prediction. Experimental results show that the proposed DDGF achieves superior performance compared with other state-of-the-art methods. Additionally, we conduct ablation experiments to verify the effectiveness of each component of the proposed DDGF.
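The abstract describes two ingredients: a p4 group equivariant convolution (the kernel is applied at the four 90° rotations, making the filter equivariant to those rotations) and lightweight spatial and channel branches that make the filter content-adaptive. The sketch below is not the authors' implementation; it is a minimal NumPy illustration of the general idea, where `spatial_gate` and `channel_gate` are hypothetical stand-ins for the paper's lightweight attention branches.

```python
import numpy as np

def p4_lift_conv(x, k):
    """Lift a 2D input to the p4 group: correlate the zero-padded input with
    the kernel at 0/90/180/270-degree rotations, one output plane each."""
    H, W = x.shape
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    outs = []
    for r in range(4):                       # the four rotations of p4
        kr = np.rot90(k, r)
        o = np.zeros((H, W))
        for i in range(H):
            for j in range(W):
                o[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kr)
        outs.append(o)
    return np.stack(outs)                    # shape (4, H, W)

def spatial_gate(x):
    # Hypothetical lightweight spatial branch: per-pixel sigmoid gate
    # computed from the mean-centred input.
    return 1.0 / (1.0 + np.exp(-(x - x.mean())))

def channel_gate(feats):
    # Hypothetical lightweight channel branch: softmax over the mean
    # activation of each rotation plane.
    e = feats.reshape(feats.shape[0], -1).mean(axis=1)
    w = np.exp(e - e.max())
    return w / w.sum()

def decoupled_dynamic_p4(x, k):
    """Content-adaptive p4 filtering: spatial gating of the input, then
    channel-wise reweighting of the four rotation planes."""
    feats = p4_lift_conv(x * spatial_gate(x), k)
    cw = channel_gate(feats)
    return feats * cw[:, None, None]
```

Decoupling the dynamics into a per-pixel gate and a per-plane gate keeps the extra cost to two cheap branches instead of predicting a full kernel per location, which is the efficiency argument the abstract makes.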

Original language: English
Pages (from-to): 111-121
Number of pages: 11
Journal: Neurocomputing
Volume: 518
DOIs
State: Published - 21 Jan 2023
Externally published: Yes

Keywords

  • Content-adaptive
  • Decoupled dynamic group equivariant filter
  • Lightweight model
  • Omnidirectional image
  • Saliency prediction
