Abstract
Large Language Models (LLMs) often struggle with the "Lost in the Middle" phenomenon in long-context question answering (Q&A). Existing solutions, such as modifying attention mechanisms or positional encodings, typically require retraining, which demands substantial computational resources. Other strategies, including long-term memory mechanisms and context processing, rely heavily on auxiliary components and fail to fundamentally enhance the LLM's reasoning capabilities. To bridge this gap, this paper proposes a novel collaborative reasoning framework. First, the framework uses a retrieval-augmented generation (RAG) approach to generate a candidate answer from sentences relevant to the input question. Next, a training-free Shadow-LLM supplements local sentence-level information from the long context during the reasoning process to produce a second candidate answer. Finally, a one-out-of-two selection strategy chooses the final answer from the two candidates. Experiments on three long-context Q&A datasets and three backbone LLMs show that our method raises the F1 score over the baselines by 2% to 18%. Notably, we find that activating only the 0th decoder layer of the LLM is sufficient for Shadow-LLM to operate at optimal performance, enabling efficient deployment without retraining. The code is available at link.
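The three-stage flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the paper's implementation: the abstract does not specify how sentences are scored, how Shadow-LLM works internally, or what rule the selection strategy applies. The token-overlap retriever and every name here (`retrieve`, `collaborative_answer`, `rag_answer`, `shadow_answer`, `select`) are hypothetical, and the two answer generators and the selector are passed in as plain callables so that any backbone could be plugged in.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text: str) -> set:
    # Lowercased alphanumeric tokens; a stand-in for a real retriever's features.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, context: str, k: int = 2) -> List[str]:
    # Assumption: rank sentences by token overlap with the question.
    # The paper's actual retrieval method is not given in the abstract.
    q_tokens = tokenize(question)
    sentences = split_sentences(context)
    ranked = sorted(sentences, key=lambda s: len(q_tokens & tokenize(s)), reverse=True)
    return ranked[:k]


def collaborative_answer(
    question: str,
    context: str,
    rag_answer: Callable[[str, List[str]], str],   # hypothetical: answers from retrieved sentences
    shadow_answer: Callable[[str, str], str],      # hypothetical: stands in for Shadow-LLM
    select: Callable[[str, str, str], str],        # hypothetical: one-out-of-two selector
    k: int = 2,
) -> str:
    relevant = retrieve(question, context, k)
    candidate_rag = rag_answer(question, relevant)        # stage 1: RAG candidate
    candidate_shadow = shadow_answer(question, context)   # stage 2: second candidate
    return select(question, candidate_rag, candidate_shadow)  # stage 3: pick one
```

With stub callables in place of real models, `collaborative_answer` simply routes the question through both paths and returns whichever candidate the selector prefers, which makes the control flow easy to test before any model is attached.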
| Original language | English |
|---|---|
| Article number | 129960 |
| Journal | Expert Systems with Applications |
| Volume | 299 |
| DOIs | |
| State | Published - 1 Mar 2026 |
Keywords
- Large language model
- Long-context processing
- Lost in the middle
- Retrieval-augmented generation
Fingerprint
Dive into the research topics of 'A collaborative reasoning framework for large language models in long-context Q&A'. Together they form a unique fingerprint.