TreeQA: Enhanced LLM-RAG with logic tree reasoning for reliable and interpretable multi-hop question answering

  • Xiangrui Zhang
  • Fuyong Zhao
  • Yutian Liu
  • Panfeng Chen
  • Yanhao Wang
  • Xiaohua Wang
  • Dan Ma
  • Huarong Xu
  • Mei Chen
  • Hui Li*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Multi-Hop Question Answering (MHQA), crucial for complex information retrieval, remains challenging for current Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems, which often suffer from hallucination, reliance on incomplete knowledge, and opaque reasoning processes. Existing RAG methods, while beneficial, still struggle with multi-step inference and with ensuring verifiable accuracy. This research introduces TreeQA, a framework designed to improve the reliability and interpretability of LLM-RAG systems on MHQA tasks. TreeQA addresses these limitations in three ways: it decomposes a complex multi-hop question into a hierarchical logic tree of simpler, verifiable sub-questions; it integrates evidence from both structured knowledge bases (e.g., Wikidata) and unstructured text (e.g., Wikipedia); and it applies an iterative, evidence-based validation and self-correction mechanism at each reasoning step, so that errors are rectified as they arise rather than accumulating. Extensive experiments on four benchmark datasets (WebQSP, QALD-en, AdvHotpotQA, and 2WikiMultiHopQA) show Hit@1 scores of 87%, 57%, 53%, and 59%, respectively, improvements of 4%-12% over state-of-the-art LLM-RAG methods. These findings highlight the value of structured, verifiable reasoning pathways for building more robust, accurate, and interpretable knowledge-intensive AI systems, and for making LLMs more useful in complex reasoning scenarios. Our code is publicly available at https://github.com/ACMISLab/TreeQA.
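
To make the logic-tree idea in the abstract concrete, here is a minimal Python sketch. It is not taken from the TreeQA repository: the names `Node`, `resolve`, `hit_at_1`, and `build_demo`, the retry budget, and the toy lookup tables standing in for the LLM and for Wikidata/Wikipedia evidence are all assumptions made for illustration. The sketch resolves sub-questions bottom-up, checks each candidate answer against evidence, and retries when a candidate is unsupported. Hit@1 is computed here as the fraction of questions whose top prediction matches a gold answer, which is the usual reading of the metric.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    """One sub-question in a hypothetical logic tree."""
    question: str
    children: list[Node] = field(default_factory=list)
    answer: Optional[str] = None
    verified: bool = False


def resolve(node: Node,
            answer_fn: Callable[[str, list, int], str],
            verify_fn: Callable[[str, str], bool],
            max_retries: int = 2) -> Node:
    """Resolve children first, then answer this node and validate the answer."""
    child_answers = [resolve(c, answer_fn, verify_fn, max_retries).answer
                     for c in node.children]
    for attempt in range(max_retries + 1):
        candidate = answer_fn(node.question, child_answers, attempt)
        node.answer = candidate
        if verify_fn(node.question, candidate):
            node.verified = True   # supported by evidence: stop retrying
            break
    return node


def hit_at_1(predictions: list, gold: list) -> float:
    """Fraction of questions whose top prediction is among the gold answers."""
    if not predictions:
        return 0.0
    hits = sum(p.strip().lower() in {g.lower() for g in answers}
               for p, answers in zip(predictions, gold))
    return hits / len(predictions)


def build_demo() -> Node:
    """Toy run: lookup tables stand in for the LLM and the evidence sources."""
    hop1 = "In which country is the Eiffel Tower located?"
    root_q = "What is the capital of the country where the Eiffel Tower is located?"
    # Assumed candidate lists; the first root candidate is deliberately wrong
    # so that the validation step has something to correct.
    candidates = {hop1: ["France"], root_q: ["Lyon", "Paris"]}
    evidence = {(hop1, "France"), (root_q, "Paris")}

    def answer_fn(question: str, child_answers: list, attempt: int) -> str:
        options = candidates[question]
        return options[min(attempt, len(options) - 1)]

    def verify_fn(question: str, candidate: str) -> bool:
        return (question, candidate) in evidence

    root = Node(root_q, children=[Node(hop1)])
    return resolve(root, answer_fn, verify_fn)
```

In this toy run the root first proposes "Lyon", fails validation, and settles on "Paris" on the retry, while the child node is accepted on its first attempt. A real system would replace `answer_fn` with an LLM call conditioned on retrieved evidence and `verify_fn` with a check against the knowledge base or retrieved passages.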

Original language: English
Article number: 114526
Journal: Knowledge-Based Systems
Volume: 330
State: Published - 25 Nov 2025

Keywords

  • Large language model
  • Logic tree
  • Multi-hop question answering
  • Retrieval-augmented generation
