Abstract
While diffusion models have demonstrated impressive performance, there is a growing need for generating samples tailored to specific user-defined concepts. These customization requirements have driven the development of few-shot diffusion models, which use a limited number $n_{ta}$ of target samples to fine-tune a diffusion model pre-trained on $n_s$ source samples. Despite their empirical success, no theoretical work specifically analyzes few-shot diffusion models. Moreover, because of the curse of dimensionality, existing results for diffusion models without a fine-tuning phase cannot explain why few-shot models generate high-quality samples. In this work, we analyze few-shot diffusion models under a linear structure distribution with a latent dimension $d$. From the approximation perspective, we prove that few-shot models have a $\widetilde{O}(n_s^{-2/d} + n_{ta}^{-1/2})$ bound to approximate the target score function, which is better than the $n_{ta}^{-2/d}$ results. From the optimization perspective, we consider a latent Gaussian special case and prove that the optimization problem has a closed-form minimizer. This means few-shot models can directly obtain an approximate minimizer without a complex optimization process. Furthermore, we provide the accuracy bound $\widetilde{O}(1/n_{ta} + 1/\sqrt{n_s})$ for the empirical solution, which still has better dependence on $n_{ta}$ than on $n_s$. Real-world experiments also show that models obtained by fine-tuning only the encoder and decoder specific to the target distribution can produce novel images with the target feature, which supports our theoretical results.
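As a rough numerical illustration of the approximation rates quoted above, the following sketch compares $n_s^{-2/d} + n_{ta}^{-1/2}$ with $n_{ta}^{-2/d}$. It drops all constants and the logarithmic factors hidden in $\widetilde{O}$, and the sample sizes and latent dimension are hypothetical values chosen for illustration, not figures from the paper. The function names are likewise illustrative.

```python
def few_shot_rate(n_s: int, n_ta: int, d: int) -> float:
    """n_s^(-2/d) + n_ta^(-1/2), with constants and log factors dropped."""
    return n_s ** (-2.0 / d) + n_ta ** (-0.5)


def target_only_rate(n_ta: int, d: int) -> float:
    """n_ta^(-2/d): the rate when only the target samples are used."""
    return n_ta ** (-2.0 / d)


# Hypothetical values (assumption, not from the paper):
# many source samples, few target samples, latent dimension 8.
n_s, n_ta, d = 10**8, 100, 8

print(few_shot_rate(n_s, n_ta, d))   # approx. 0.11
print(target_only_rate(n_ta, d))     # approx. 0.316
```

With these assumed values the few-shot rate is smaller because the dimension-dependent term is paid on the large source set, while the small target set enters only through the dimension-free $n_{ta}^{-1/2}$ term.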
| Original language | English |
|---|---|
| Journal | Advances in Neural Information Processing Systems |
| Volume | 37 |
| Publication status | Published - 2024 |
| Event | 38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada. Duration: 9 Dec 2024 → 15 Dec 2024 |