CiGA: A Cross-Layer Fine-Grained Attention Correction Method for Large Language Model

Duo Li, Jing Zhao

Research output: Contribution to journal › Conference article › peer-review

Abstract

Fine-grained text processing is a significant domain in Natural Language Processing (NLP), covering tasks such as long-document question answering, aspect-based sentiment analysis, and document summarization. Although Large Language Models (LLMs) perform well on many NLP tasks, they often exhibit hallucinations, such as detail loss or factual inaccuracies, in tasks that require handling fine-grained content. This shortcoming arises because the final layers of LLMs tend to attend less to details than the middle layers do. Existing optimization methods for LLMs lack a focus on attention mechanisms for fine-grained information. To address this issue, we propose a novel Cross-Layer Fine-Grained Attention Correction method (CiGA). CiGA introduces two correction terms that integrate detail-oriented attention from the middle layers into the final layers. Experimental results demonstrate that CiGA significantly improves LLMs' performance on fine-grained text processing tasks.
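To make the mechanism concrete, the sketch below illustrates one plausible reading of a cross-layer attention correction: blending a middle layer's attention map into the final layer's attention via two additive correction terms, then renormalizing. This is not the authors' implementation; the layer choices, the exact form of the two terms, and the blend coefficients alpha and beta are illustrative assumptions.

# Minimal sketch of cross-layer attention correction (assumed form, not CiGA's exact terms).
import torch

def corrected_final_attention(attn_final: torch.Tensor,
                              attn_middle: torch.Tensor,
                              alpha: float = 0.3,
                              beta: float = 0.1) -> torch.Tensor:
    """Blend a middle layer's attention weights into the final layer's.

    attn_final, attn_middle: (batch, heads, seq, seq) row-normalized attention weights.
    alpha: weight of the first correction term (the middle-layer map itself).
    beta:  weight of the second correction term (attention mass the final layer
           lost relative to the middle layer) -- both terms are assumptions.
    """
    # Correction term 1: detail-oriented attention from the middle layer.
    term1 = attn_middle
    # Correction term 2: positive part of the attention the final layer under-weights.
    term2 = torch.relu(attn_middle - attn_final)
    corrected = attn_final + alpha * term1 + beta * term2
    # Renormalize so each row is again a probability distribution.
    return corrected / corrected.sum(dim=-1, keepdim=True).clamp_min(1e-9)

if __name__ == "__main__":
    b, h, s = 1, 2, 5
    a_mid = torch.softmax(torch.randn(b, h, s, s), dim=-1)
    a_fin = torch.softmax(torch.randn(b, h, s, s), dim=-1)
    out = corrected_final_attention(a_fin, a_mid)
    print(out.sum(dim=-1))  # each row sums to ~1

In this reading, the corrected map replaces the final layer's attention before value aggregation, so fine-grained tokens emphasized in the middle layers regain influence on the output.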

Keywords

  • Attention Weights
  • Causal Effect
  • Fine-grained Information
  • Large Language Models
  • Natural Language Processing
