PromptIF: A prompt-based general image fusion framework

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Multimodal image fusion is a challenging task, involving areas such as visible-infrared fusion, multi-exposure fusion, and multi-focus fusion. These tasks require merging images from different modalities, each with unique characteristics, making it difficult to develop a unified model that can handle all of them effectively. While deep learning has made significant progress in these areas, the inherent differences between image types still present challenges in achieving optimal fusion performance. A unified model could simplify processing and improve results in downstream tasks, such as object detection, semantic segmentation, and scene analysis. Inspired by the success of prompt-based techniques in large models and natural language processing (NLP), we introduce PromptIF, a lightweight and efficient fusion model based on prompts. PromptIF is designed to adapt to different fusion tasks by using minimal extra parameters, which allows it to effectively preserve important image details while also differentiating between tasks. Our results demonstrate that PromptIF not only outperforms both traditional and recent fusion methods but also achieves strong results across various benchmarks and downstream applications. This shows that our approach is both flexible and effective in real-world scenarios. We will release the code to encourage further exploration and development in the field of multimodal image fusion.
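The abstract does not describe how the prompts are implemented, so the following is only a toy sketch of the general idea of task conditioning with very few extra parameters, not the PromptIF architecture. It assumes a hypothetical table `TASK_PROMPTS` holding one two-value prompt per fusion task and a shared `fuse` function that turns the selected prompt into blending weights; all names and values are made up for illustration.

```python
import math

# Hypothetical task prompts: one tiny parameter vector per fusion task.
# In a trained model these would be learned; the values here are invented.
TASK_PROMPTS = {
    "visible_infrared": [0.0, 0.0],
    "multi_exposure": [0.5, -0.5],
    "multi_focus": [-1.0, 1.0],
}


def softmax(logits):
    """Convert a list of logits into weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(image_a, image_b, task):
    """Blend two same-sized grayscale images (lists of rows of floats).

    The fusion rule is shared across tasks; only the small prompt
    selected by `task` changes the blending weights.
    """
    w_a, w_b = softmax(TASK_PROMPTS[task])
    return [
        [w_a * pa + w_b * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


def extra_parameters_per_task():
    """Number of task-specific parameters in this toy setup."""
    return len(next(iter(TASK_PROMPTS.values())))
```

In this sketch, supporting a new task means adding one short entry to `TASK_PROMPTS` while the shared fusion code stays unchanged, which mirrors the "minimal extra parameters" claim only in spirit.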

Original language: English
Article number: 103386
Journal: Displays
Volume: 93
DOI
Publication status: Published - Jul 2026
