TY - GEN
T1 - A Novel Framework for Realistic 3D Scene Regeneration with Graph of Thoughts
AU - Kou, Yitian
AU - Zhang, Kaiwei
AU - Zhu, Dandan
AU - Min, Xiongkuo
AU - Zhai, Guangtao
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - In embodied intelligence applications, highly realistic 3D scenes lay the foundation for perception and decision-making, while 3D scene regeneration creates more coherent and personalized virtual spaces, facilitating more efficient task adaptation and agent training. To address this, we propose a reasoning framework based on the Graph of Thoughts (GoT), which enhances the prompting capabilities of large language models (LLMs) and integrates a synergistic mechanism of retrospective memory and feedback loops into the regeneration process. During the initial generation phase, we retain the Holodeck paradigm, combining LLM-driven scene design inferences with the spatial layout of 3D assets from Objaverse. In the regeneration phase, dynamic feedback loops trigger backtracking of reasoning memory to adjust relevant elements according to evolving requirements, while maintaining stability and consistency in unrelated elements, ensuring the scene's overall coherence. We conduct both subjective and objective experiments to validate the effectiveness of this framework, demonstrating significant improvements in 3D scene generation.
AB - In embodied intelligence applications, highly realistic 3D scenes lay the foundation for perception and decision-making, while 3D scene regeneration creates more coherent and personalized virtual spaces, facilitating more efficient task adaptation and agent training. To address this, we propose a reasoning framework based on the Graph of Thoughts (GoT), which enhances the prompting capabilities of large language models (LLMs) and integrates a synergistic mechanism of retrospective memory and feedback loops into the regeneration process. During the initial generation phase, we retain the Holodeck paradigm, combining LLM-driven scene design inferences with the spatial layout of 3D assets from Objaverse. In the regeneration phase, dynamic feedback loops trigger backtracking of reasoning memory to adjust relevant elements according to evolving requirements, while maintaining stability and consistency in unrelated elements, ensuring the scene's overall coherence. We conduct both subjective and objective experiments to validate the effectiveness of this framework, demonstrating significant improvements in 3D scene generation.
KW - 3D scene regeneration
KW - embodied intelligence
KW - graph of thoughts
KW - large language model
UR - https://www.scopus.com/pages/publications/105022610296
U2 - 10.1109/ICME59968.2025.11209258
DO - 10.1109/ICME59968.2025.11209258
M3 - Conference contribution
AN - SCOPUS:105022610296
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2025 IEEE International Conference on Multimedia and Expo
PB - IEEE Computer Society
T2 - 2025 IEEE International Conference on Multimedia and Expo, ICME 2025
Y2 - 30 June 2025 through 4 July 2025
ER -