Injecting-Diffusion: Inject Domain-Independent Contents into Diffusion Models for Unpaired Image-to-Image Translation

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Scopus citation

Abstract

Diffusion models have shown remarkable performance in image synthesis. However, we observe that existing methods fail to preserve the domain-independent contents of the input images, which makes unpaired image-to-image translation challenging. To address this issue, we propose a diffusion model with domain-independent content injection. A domain-independent content extractor first obtains domain-independent contents from the source domain. We then inject the extracted contents into the diffusion model and fuse them with the domain-specific appearances of the target domain through our proposed cross-domain attention mechanism. Qualitative and quantitative experiments demonstrate that the proposed method can generate high-fidelity images of the target domain while preserving the domain-independent contents of the source domain.
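
The abstract does not specify how the cross-domain attention mechanism is built, so the following is only a generic sketch of scaled dot-product cross-attention, not the authors' implementation. It assumes, purely for illustration, that queries come from the extracted content features and that keys and values come from target-domain appearance features; the names `cross_attention`, `content`, and `appearance` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (illustrative only).

    queries: n vectors of length d   (assumed here: content features)
    keys:    m vectors of length d   (assumed here: appearance features)
    values:  m vectors of length d_v (assumed here: appearance features)
    Returns n vectors of length d_v, each a weighted mix of the values.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    d_v = len(values[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by 1/sqrt(d).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(d_v)])
    return out


# Toy inputs: two content tokens and two appearance tokens (hypothetical data).
content = [[1.0, 0.0], [0.0, 1.0]]
appearance = [[10.0, 0.0], [0.0, 10.0]]
fused = cross_attention(content, appearance, appearance)
```

Each output vector is a convex combination of the appearance vectors, weighted by how closely the corresponding content token matches each appearance token. Which features actually serve as queries, keys, and values in the paper is not stated in this record.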

Original language: English
Title of host publication: Proceedings - 2023 IEEE International Conference on Multimedia and Expo, ICME 2023
Publisher: IEEE Computer Society
Pages: 282-287
Number of pages: 6
ISBN (Electronic): 9781665468916
State: Published - 2023
Externally published: Yes
Event: 2023 IEEE International Conference on Multimedia and Expo, ICME 2023 - Brisbane, Australia
Duration: 10 Jul 2023 - 14 Jul 2023

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
Volume: 2023-July
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2023 IEEE International Conference on Multimedia and Expo, ICME 2023
Country/Territory: Australia
City: Brisbane
Period: 10/07/23 - 14/07/23

Keywords

  • diffusion models
  • domain-independent contents
  • unpaired image-to-image translation
