GCAT: Gated Convolutional Attention Transformer for Efficient Image Super-Resolution

Zhijian Wu, Kaiyi Feng, Dingjiang Huang*

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

Recently, Transformer-based methods have achieved impressive performance in many computer vision tasks (e.g., image super-resolution (SR)) owing to their long-range modeling ability. However, their high computational cost makes these methods unsuitable for resource-constrained devices, especially in image SR tasks involving high-resolution images. In this paper, we propose a concise and effective Gated Convolutional Attention Unit (GCAU) that relies on cheap convolutional operations. Specifically, GCAU consists of Convolutional Transposed Attention (CTA) and Locally-enhanced Gating (LeG) in parallel. The former efficiently models global relational interactions by computing cross-covariance across the channel dimension, while the latter controls the information flow from the former, directing the network to focus on more refined image attributes. Without bells and whistles, we present a simple SR Transformer, GCAT, built by cascading GCAUs. Extensive experimental results demonstrate that GCAT achieves state-of-the-art performance among existing efficient SR methods with significantly lower complexity. In particular, GCAT is on average 5× faster than SwinIR-light with comparable performance.
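The key efficiency idea in the abstract is that attention is computed across the channel dimension (a C×C cross-covariance map) instead of across spatial positions (an N×N map), with a parallel gate modulating the result. The following NumPy sketch illustrates that structure only; the function and weight names, the L2 normalization, and the plain sigmoid gate are illustrative assumptions, not the paper's exact CTA/LeG design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(x, wq, wk, wv):
    # x: (C, N) feature map, C channels flattened over N spatial positions.
    # Transposed ("cross-covariance") attention: the attention map is
    # C x C across channels, so cost scales with C^2 * N rather than
    # the N^2 * C of spatial self-attention -- the win at high resolution.
    q, k, v = wq @ x, wk @ x, wv @ x                      # each (C, N)
    # L2-normalize along the spatial axis before the dot product
    # (a common choice in cross-covariance attention; assumed here).
    qn = q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-6)
    kn = k / (np.linalg.norm(k, axis=1, keepdims=True) + 1e-6)
    attn = softmax(qn @ kn.T, axis=-1)                    # (C, C)
    return attn @ v                                       # (C, N)

def gated_unit(x, wq, wk, wv, wg):
    # Parallel gating branch (a stand-in for LeG): a sigmoid gate in
    # (0, 1) modulates the attention output element-wise, controlling
    # how much of the globally-mixed signal passes through.
    gate = 1.0 / (1.0 + np.exp(-(wg @ x)))                # (C, N)
    return gate * channel_attention(x, wq, wk, wv)

# Usage on random features:
rng = np.random.default_rng(0)
C, N = 8, 64
x = rng.standard_normal((C, N))
wq, wk, wv, wg = (0.1 * rng.standard_normal((C, C)) for _ in range(4))
y = gated_unit(x, wq, wk, wv, wg)
```

Because the attention map is C×C, its cost is independent of image resolution, which is what makes this family of designs attractive for high-resolution SR inputs.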

Keywords

  • Efficient Network
  • Gated Linear Unit
  • Image Super-resolution
  • Transformer
