TY - JOUR
T1 - TreeQA
T2 - Enhanced LLM-RAG with logic tree reasoning for reliable and interpretable multi-hop question answering
AU - Zhang, Xiangrui
AU - Zhao, Fuyong
AU - Liu, Yutian
AU - Chen, Panfeng
AU - Wang, Yanhao
AU - Wang, Xiaohua
AU - Ma, Dan
AU - Xu, Huarong
AU - Chen, Mei
AU - Li, Hui
N1 - Publisher Copyright: © 2025 The Authors
PY - 2025/11/25
Y1 - 2025/11/25
AB - Multi-Hop Question Answering (MHQA), crucial for complex information retrieval, remains challenging for current Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems, which often suffer from hallucination, reliance on incomplete knowledge, and opaque reasoning processes. Existing RAG methods, while beneficial, still struggle with the intricacies of multi-step inference and ensuring verifiable accuracy. This research introduces TreeQA, a novel framework designed to significantly enhance the reliability and interpretability of LLM-RAG systems in MHQA tasks. TreeQA addresses these limitations by decomposing complex multi-hop questions into a hierarchical logic tree of simpler, verifiable sub-questions, integrating evidence from both structured knowledge bases (e.g., Wikidata) and unstructured text (e.g., Wikipedia), and employing an iterative, evidence-based validation and self-correction mechanism at each reasoning step to dynamically rectify errors and prevent their accumulation. Extensive experiments on four benchmark datasets (WebQSP, QALD-en, AdvHotpotQA, and 2WikiMultiHopQA) demonstrate TreeQA's superior performance, achieving Hit@1 scores of 87 %, 57 %, 53 %, and 59 %, respectively, representing improvements of 4 %-12 % over state-of-the-art LLM-RAG methods. These findings highlight the significant impact of structured, verifiable reasoning pathways in developing more robust, accurate, and interpretable knowledge-intensive AI systems, thereby enhancing the practical utility of LLMs in complex reasoning scenarios. Our code is publicly available at https://github.com/ACMISLab/TreeQA.
KW - Large language model
KW - Logic tree
KW - Multi-hop question answering
KW - Retrieval-augmented generation
UR - https://www.scopus.com/pages/publications/105017774287
U2 - 10.1016/j.knosys.2025.114526
DO - 10.1016/j.knosys.2025.114526
M3 - Article
AN - SCOPUS:105017774287
SN - 0950-7051
VL - 330
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 114526
ER -