TY - GEN
T1 - DiffSynth
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024
AU - Duan, Zhongjie
AU - You, Lizhou
AU - Wang, Chengyu
AU - Chen, Cen
AU - Wu, Ziheng
AU - Qian, Weining
AU - Huang, Jun
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
N2 - In recent years, diffusion models have emerged as a powerful approach in the field of image synthesis. However, applying these models directly to video synthesis presents challenges, often leading to noticeable flickering in the content. Although recently proposed zero-shot methods can alleviate flickering to some extent, generating coherent videos remains a struggle. In this paper, we propose DiffSynth, a novel approach that converts image synthesis pipelines into video synthesis pipelines. DiffSynth consists of two key components: a latent in-iteration deflickering framework and a video deflickering algorithm. The latent in-iteration deflickering framework applies video deflickering in the latent space of diffusion models, effectively preventing flicker accumulation in intermediate steps. Additionally, we introduce a video deflickering algorithm, named the patch blending algorithm, which remaps objects across different frames and blends them to enhance video consistency. One of the notable advantages of DiffSynth is its general applicability to various video synthesis tasks, including text-guided video stylization, fashion video synthesis, image-guided video stylization, video restoration, and 3D rendering. In the task of text-guided video stylization, we make it possible to synthesize high-quality videos without cherry-picking. The experimental results demonstrate the effectiveness of DiffSynth, and we further showcase its practical value on the Alibaba e-commerce platform.
KW - Generative models
KW - Video deflickering
KW - Video synthesis
UR - https://www.scopus.com/pages/publications/85203866909
U2 - 10.1007/978-3-031-70381-2_21
DO - 10.1007/978-3-031-70381-2_21
M3 - Conference contribution
AN - SCOPUS:85203866909
SN - 9783031703805
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 332
EP - 347
BT - Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track - European Conference, ECML PKDD 2024, Proceedings
A2 - Bifet, Albert
A2 - Krilavičius, Tomas
A2 - Miliou, Ioanna
A2 - Nowaczyk, Slawomir
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 9 September 2024 through 13 September 2024
ER -