EvaSR: Rethinking Efficient Visual Attention Design for Image Super-Resolution

  • Zhijian Wu
  • Chenhan Zhang
  • Dingjiang Huang*

  *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Owing to the long-range modeling ability of the self-attention mechanism, the Transformer has taken various vision tasks by storm, including image super-resolution (SR). In this study, we show that a convolutional neural network (CNN) with proper visual attention is a simpler and more effective paradigm than the Transformer for image SR tasks. We reexamine successful SR models and identify several key characteristics that contribute to accurate image reconstruction. Building on this recipe, we propose a pure CNN-based SR network that uses efficient visual attention, dubbed EvaSR. Benefiting from its carefully designed visual attention, EvaSR captures both local structure and long-range dependencies and achieves adaptivity in the spatial and channel dimensions, while retaining the simplicity and efficiency of CNNs. Experimental results demonstrate that EvaSR achieves state-of-the-art performance among existing efficient SR methods. In particular, the tiny version of EvaSR needs only 21.4% and 15.2% of the parameters of IMDN and SMSR, respectively, while delivering better performance.

Original language: English
Title of host publication: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings
Editors: Bhaskar D Rao, Isabel Trancoso, Gaurav Sharma, Neelesh B. Mehta
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350368741
State: Published - 2025
Event: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Hyderabad, India
Duration: 6 Apr 2025 to 11 Apr 2025

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Country/Territory: India
City: Hyderabad
Period: 6/04/25 to 11/04/25

Keywords

  • Efficient Network
  • Gated Linear Unit
  • Image Super-resolution
  • Transformer
