TY - GEN
T1 - ProRAG
T2 - 30th International Conference on Database Systems for Advanced Applications, DASFAA 2025
AU - Zhou, Yongkang
AU - Yan, Muyang
AU - Yao, Junjie
AU - Xu, Gang
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2026.
PY - 2026
Y1 - 2026
N2 - The concept of a Virtual Human represents an advanced interactive interface that bridges users with digital information, offering an increasingly realistic experience. Recent breakthroughs in Large Language Models (LLMs) and AI-Generated Content (AIGC) have significantly improved the lifelike nature of virtual humans, making them increasingly indistinguishable from real humans. However, this rapid progress raises significant concerns regarding the ethical implications and the reliability of virtual human interactions, particularly in high-stakes, domain-specific scenarios where factual accuracy and trustworthiness are paramount. In response to these challenges, we introduce ProRAG, a novel framework designed to enhance the trustworthiness and reliability of digital avatars. ProRAG combines domain-specific LLMs with innovative strategies to address key challenges such as hallucinations, computational inefficiency, and context stability. Our approach integrates a multimodal knowledge base, consisting of textual, visual, and auditory data, to improve retrieval accuracy and content consistency. Furthermore, ProRAG supports multimodal digital human interactions, facilitating voice, visual, and text communication, which ensures high trust for critical applications. By leveraging adaptive data representation techniques, ProRAG resolves the “Lost in the Middle” challenge, enhancing hallucination suppression and promoting structured knowledge integration. This framework is designed to be scalable and versatile, demonstrating its potential across diverse domains such as education, cultural preservation, and legal consultation, while ensuring the generation of reliable, context-aware content in mission-critical decision-making environments.
KW - Digital Avatar
KW - Knowledge Integration
KW - Large Language Models
KW - Multi-modal Interaction
KW - Retrieval Augmented Generation
UR - https://www.scopus.com/pages/publications/105028265063
U2 - 10.1007/978-981-95-4158-4_29
DO - 10.1007/978-981-95-4158-4_29
M3 - Conference contribution
AN - SCOPUS:105028265063
SN - 9789819541577
T3 - Lecture Notes in Computer Science
SP - 408
EP - 419
BT - Database Systems for Advanced Applications - 30th International Conference, DASFAA 2025, Proceedings
A2 - Zhu, Feida
A2 - Lim, Ee-Peng
A2 - Yu, Philip S.
A2 - Nadamoto, Akiyo
A2 - Shim, Kyuseok
A2 - Ding, Wei
A2 - Zhang, Bingxue
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 26 May 2025 through 29 May 2025
ER -