TY - GEN
T1 - Wandering and feeling the Scenes
T2 - 7th ACM International Conference on Multimedia in Asia, MMAsia 2025
AU - Zhang, Chong
AU - Gong, Jingyu
AU - Lin, Shaohui
AU - Li, Yang
AU - Zhang, Zhizhong
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s).
PY - 2025/12/6
Y1 - 2025/12/6
N2 - As demand for virtual digital characters grows in fields such as virtual reality, gaming, and animation, generating highly controllable human motion within scenes has become a key research focus. Existing methods for scene-aware motion generation typically rely on global alignment or latent space matching, which provide limited control over the fine-grained movements of individual body parts. This limitation often leads to rigid and unrealistic motions when interacting with complex environments. Therefore, we propose the Body-Aware Interaction Diffusion Model (BA-IDM), which enables fine-grained control of human motion within a scene by leveraging multimodal information. Text descriptions, motion scenes, and movement trajectories can all serve as inputs, allowing for precise control of each body part and facilitating the generation of a wide range of complex actions. Moreover, our approach is designed to operate on de-identified motion data, effectively protecting user privacy throughout the process, which is essential for practical and user-centric applications.
AB - As demand for virtual digital characters grows in fields such as virtual reality, gaming, and animation, generating highly controllable human motion within scenes has become a key research focus. Existing methods for scene-aware motion generation typically rely on global alignment or latent space matching, which provide limited control over the fine-grained movements of individual body parts. This limitation often leads to rigid and unrealistic motions when interacting with complex environments. Therefore, we propose the Body-Aware Interaction Diffusion Model (BA-IDM), which enables fine-grained control of human motion within a scene by leveraging multimodal information. Text descriptions, motion scenes, and movement trajectories can all serve as inputs, allowing for precise control of each body part and facilitating the generation of a wide range of complex actions. Moreover, our approach is designed to operate on de-identified motion data, effectively protecting user privacy throughout the process, which is essential for practical and user-centric applications.
KW - Human motion generation
KW - Privacy-aware modeling
KW - Scene interaction
UR - https://www.scopus.com/pages/publications/105025107494
U2 - 10.1145/3743093.3770934
DO - 10.1145/3743093.3770934
M3 - Conference contribution
AN - SCOPUS:105025107494
T3 - Proceedings of the 7th ACM International Conference on Multimedia in Asia, MMAsia 2025
BT - Proceedings of the 7th ACM International Conference on Multimedia in Asia, MMAsia 2025
A2 - Chua, Tat-Seng
A2 - Wong, Lai-Kuan
A2 - Chan, Chee Seng
A2 - Tang, Jinhui
A2 - Ngo, Chong-Wah
A2 - Schoeffmann, Klaus
A2 - Liu, Jiaying
A2 - Ho, Yo-Sung
PB - Association for Computing Machinery, Inc
Y2 - 9 December 2025 through 12 December 2025
ER -