HALLUSHIFT: Measuring Distribution Shifts towards Hallucination Detection in LLMs

  • Sharanya Dasgupta*
  • Sujoy Nath
  • Arkaprabha Basu
  • Pourya Shamsolmoali
  • Swagatam Das
  *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

Large Language Models (LLMs) have recently garnered widespread attention due to their adeptness at generating innovative responses to the given prompts across a multitude of domains. However, LLMs often suffer from the inherent limitation of hallucinations and generate incorrect information while maintaining well-structured and coherent responses. In this work, we hypothesize that hallucinations stem from the internal dynamics of LLMs. Our observations indicate that, during passage generation, LLMs tend to deviate from factual accuracy in subtle parts of responses, eventually shifting toward misinformation. This phenomenon bears a resemblance to human cognition, where individuals may hallucinate while maintaining logical coherence, embedding uncertainty within minor segments of their speech. To investigate this further, we introduce an innovative approach, HALLUSHIFT, designed to analyze the distribution shifts in the internal state space and token probabilities of the LLM-generated responses. Our method attains superior performance compared to existing baselines across various benchmark datasets. Our codebase is available at https://github.com/sharanya-dasgupta001/hallushift.

Original language: English
Title of host publication: International Joint Conference on Neural Networks, IJCNN 2025 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331510428
DOI
Publication status: Published - 2025
Event: 2025 International Joint Conference on Neural Networks, IJCNN 2025 - Rome, Italy
Duration: 30 Jun 2025 to 5 Jul 2025

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
ISSN (Print): 2161-4393
ISSN (Electronic): 2161-4407

Conference

Conference: 2025 International Joint Conference on Neural Networks, IJCNN 2025
Country/Territory: Italy
City: Rome
Period: 30/06/25 to 5/07/25
