TY - GEN
T1 - Text-Guided 3D Object Generation via Disentangled Shape and Appearance Score Distillation Sampling
AU - Chen, Ang
AU - Yi, Ran
AU - Ma, Lizhuang
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Previous text-guided 3D generative models primarily utilize CLIP-based semantic constraints or employ score distillation sampling from pre-trained 2D diffusion models to provide prior knowledge for neural radiance field optimization. However, these methods do not incorporate 3D prior knowledge and are prone to generating poorly structured 3D objects, particularly when the text prompts are not specific. In this paper, we exploit 3D shape prior information from text-guided shape generative models and propose a novel Double Score Distillation Sampling method (Double SDS). In contrast to previous methods that use only 2D diffusion models in color space, our proposed method leverages both text-to-shape and text-to-image diffusion models to optimize the disentangled color and density of the neural radiance field, respectively. Additionally, we employ the Low-Rank Adaptation method to fine-tune the pre-trained diffusion models, aiming to enhance the similarity between the generated 3D objects and the target 3D object datasets. Experimental results demonstrate that our proposed method generates 3D objects with higher visual quality and better geometric structure than previous methods.
AB - Previous text-guided 3D generative models primarily utilize CLIP-based semantic constraints or employ score distillation sampling from pre-trained 2D diffusion models to provide prior knowledge for neural radiance field optimization. However, these methods do not incorporate 3D prior knowledge and are prone to generating poorly structured 3D objects, particularly when the text prompts are not specific. In this paper, we exploit 3D shape prior information from text-guided shape generative models and propose a novel Double Score Distillation Sampling method (Double SDS). In contrast to previous methods that use only 2D diffusion models in color space, our proposed method leverages both text-to-shape and text-to-image diffusion models to optimize the disentangled color and density of the neural radiance field, respectively. Additionally, we employ the Low-Rank Adaptation method to fine-tune the pre-trained diffusion models, aiming to enhance the similarity between the generated 3D objects and the target 3D object datasets. Experimental results demonstrate that our proposed method generates 3D objects with higher visual quality and better geometric structure than previous methods.
KW - Diffusion model
KW - Neural radiance fields
KW - Text-guided 3D generation
UR - https://www.scopus.com/pages/publications/85183319480
U2 - 10.1109/CISP-BMEI60920.2023.10373352
DO - 10.1109/CISP-BMEI60920.2023.10373352
M3 - Conference contribution
AN - SCOPUS:85183319480
T3 - Proceedings - 2023 16th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2023
BT - Proceedings - 2023 16th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2023
A2 - Zhao, XiaoMing
A2 - Li, Qingli
A2 - Wang, Lipo
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 16th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2023
Y2 - 28 October 2023 through 30 October 2023
ER -