TY - JOUR
T1 - A collaborative reasoning framework for large language models in long-context Q&A
AU - Yao, Jiacheng
AU - He, Guoxiu
AU - Xu, Xin
N1 - Publisher Copyright:
© 2025 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
PY - 2025
Y1 - 2025
N2 - Large Language Models (LLMs) often struggle with the Lost in the Middle phenomenon in long-context question answering (Q&A). Existing solutions, such as modifying attention mechanisms or positional encodings, typically require retraining, which demands substantial computational resources. Other strategies, including long-term memory mechanisms and context processing, heavily rely on auxiliary components and fail to fundamentally enhance the LLM’s reasoning capabilities. To bridge this gap, this paper proposes a novel collaborative reasoning framework. Initially, the framework uses a retrieval-augmented generation (RAG) approach to generate a candidate answer from sentences relevant to the input question. Subsequently, a training-free Shadow-LLM is designed to supplement local sentence-level information from the long context during the reasoning process to produce another candidate answer. Finally, a one-out-of-two selection strategy chooses the final answer based on the two candidates. Experiments on three long-context Q&A datasets and three backbone LLMs show that our method raises the F1 score over the baselines by 2% to 18%. Notably, we find that activating only the 0th decoder layer of the LLM is sufficient for Shadow-LLM to operate at optimal performance, enabling efficient deployment without retraining. The code is available at link.
KW - Large language model
KW - Long-context processing
KW - Lost in the middle
KW - Retrieval-augmented generation
UR - https://www.scopus.com/pages/publications/105020589354
U2 - 10.1016/j.eswa.2025.129960
DO - 10.1016/j.eswa.2025.129960
M3 - Article
AN - SCOPUS:105020589354
SN - 0957-4174
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 129960
ER -