LIPT: Latency-Aware Image Processing Transformer

Junbo Qiao, Wei Li, Haizhen Xie, Hanting Chen, Jie Hu, Shaohui Lin*, Jungong Han

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Transformers are leading a trend in the field of image processing. While existing lightweight image processing transformers have achieved notable success, they primarily focus on reducing FLOPs (floating-point operations) or the number of parameters, rather than on practical inference acceleration. In this paper, we present a latency-aware image processing transformer, termed LIPT. We devise a low-latency LIPT block that substitutes memory-intensive operators with a combination of self-attention and convolutions to achieve practical speedup. Specifically, we propose a novel non-volatile sparse masking self-attention (NVSM-SA) that uses a pre-computed sparse mask to capture contextual information from a larger window without extra computational overhead. In addition, a high-frequency reparameterization module (HRM) is proposed to make the LIPT block reparameterization-friendly, enhancing the model's ability to reconstruct fine details. Extensive experiments on multiple image processing tasks (e.g., image super-resolution (SR), JPEG artifact reduction, and image denoising) demonstrate the superiority of LIPT in both latency and PSNR. LIPT achieves real-time GPU inference with state-of-the-art performance on multiple image SR benchmarks. The source code is released at https://github.com/Lucien66/LIPT
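The core idea behind NVSM-SA, as described in the abstract, is that a sparse subset of key/value positions from an enlarged window is fixed ahead of time ("non-volatile"), so attention cost stays proportional to the subset size rather than the enlarged window size. The following is only an illustrative sketch of that general mechanism in NumPy, not the paper's actual implementation; all function names, window sizes, and the random-sampling choice of mask positions are assumptions for demonstration.

```python
import numpy as np

def precompute_sparse_mask(large_win=8, keep=16, seed=0):
    # Hypothetical pre-computation step: pick a fixed subset of `keep`
    # positions out of the enlarged (large_win x large_win) window once,
    # offline. The same mask is reused for every forward pass, so no
    # extra selection cost is paid at inference time.
    rng = np.random.default_rng(seed)
    idx = rng.choice(large_win * large_win, size=keep, replace=False)
    return np.sort(idx)

def sparse_window_attention(x, mask_idx):
    # x: (N, d) tokens of one enlarged window.
    # All N tokens act as queries, but keys/values come only from the
    # pre-computed sparse subset, giving an (N, keep) score matrix
    # instead of the dense (N, N) one.
    q = x
    kv = x[mask_idx]
    scale = 1.0 / np.sqrt(x.shape[1])
    scores = q @ kv.T * scale
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn = scores / scores.sum(axis=1, keepdims=True)  # softmax over sparse keys
    return attn @ kv                                   # (N, d) output
```

With `large_win=8` and `keep=16`, each of the 64 query tokens attends to only 16 keys, so the attention matrix is 4x smaller than dense window attention over the same enlarged window, while context is still drawn from the full 8x8 region.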

Original language: English
Pages (from-to): 3056-3069
Number of pages: 14
Journal: IEEE Transactions on Image Processing
Volume: 34
DOIs
State: Published - 2025

Keywords

  • Image processing
  • non-volatile sampling mask
  • reparameterization
  • transformer
