FrDiff: Framelet-Based Conditional Diffusion Model for Multispectral and Panchromatic Image Fusion

Junkang Zhang, Faming Fang*, Tingting Wang, Guixu Zhang, Haichuan Song*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

4 Scopus citations

Abstract

Pansharpening fuses low-resolution multispectral (LRMS) and high-resolution panchromatic (PAN) imagery to generate high-resolution multispectral (HRMS) imagery. Most existing pansharpening frameworks focus on directly learning the mapping from PAN and LRMS images to HRMS images. A key limitation of these frameworks is their potential overemphasis on spatial information, particularly the enhancement of high-frequency components. This imbalance can hinder the model's ability to restore spectral and spatial details simultaneously. To address this issue, we propose a novel pansharpening model based on the denoising diffusion probabilistic model (DDPM), dubbed FrDiff. Specifically, we build a framelet-based conditional diffusion model that leverages the generative power of diffusion models to produce more refined results. Unlike conventional methods that infer HRMS images directly, our strategy predicts their framelet coefficients, using the available PAN and LRMS images as inputs. The framelet transform separates the high-frequency and low-frequency components of these inputs, and the components are then recombined into a new set of conditional embeddings that feed into the diffusion process. At the same time, the strong predictive capability of the diffusion model is exploited to recover the high-frequency and low-frequency components of the HRMS image simultaneously. Moreover, we introduce a framelet-oriented cross-attention module dedicated to improving the spectral fidelity of the HRMS images, ensuring a balanced emphasis on both spatial and spectral enhancement. Quantitative and qualitative experiments on multiple benchmark datasets demonstrate that the proposed method produces more robust and higher-quality results than other state-of-the-art pansharpening methods.
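
The frequency separation that the abstract attributes to the framelet transform can be illustrated with a minimal sketch. The abstract does not state which framelet system FrDiff uses, so the code below assumes a single-level, undecimated piecewise-linear B-spline framelet with circular boundary handling, applied to a 1D signal; the names `FILTERS`, `circ_filter`, `decompose`, and `reconstruct` are hypothetical and not taken from the paper. For an image, the same filters would be applied along rows and then along columns.

```python
import math

# Assumed filter bank: piecewise-linear B-spline framelet
# (one low-pass filter followed by two high-pass filters).
FILTERS = [
    [0.25, 0.5, 0.25],
    [math.sqrt(2) / 4, 0.0, -math.sqrt(2) / 4],
    [-0.25, 0.5, -0.25],
]


def circ_filter(signal, taps):
    """Apply a 3-tap filter centred on each sample, wrapping at the ends."""
    n = len(signal)
    return [
        sum(t * signal[(k + offset) % n] for t, offset in zip(taps, (-1, 0, 1)))
        for k in range(n)
    ]


def decompose(signal):
    """Return [low, high1, high2] framelet bands, each as long as the input."""
    return [circ_filter(signal, taps) for taps in FILTERS]


def reconstruct(bands):
    """Invert decompose() by filtering each band with the reversed taps and summing."""
    n = len(bands[0])
    out = [0.0] * n
    for band, taps in zip(bands, FILTERS):
        back = circ_filter(band, taps[::-1])
        out = [a + b for a, b in zip(out, back)]
    return out


signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
bands = decompose(signal)
restored = reconstruct(bands)
```

Because these three filters form a tight frame, `restored` matches `signal` up to floating-point error. The first band keeps the smooth, low-frequency content, while the other two bands sum to zero and carry only the high-frequency detail.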

Original language: English
Pages (from-to): 5989-6002
Number of pages: 14
Journal: IEEE Transactions on Multimedia
Volume: 27
State: Published - 2025

Keywords

  • Convolutional neural network
  • diffusion model
  • framelet transform
  • pansharpening
