Efficient Universal Goal Hijacking with Semantics-guided Prompt Organization

  • Yihao Huang
  • Chong Wang
  • Xiaojun Jia*
  • Qing Guo
  • Felix Juefei-Xu
  • Jian Zhang
  • Yang Liu
  • Geguang Pu
  • *Corresponding author for this work
  • Nanyang Technological University
  • CFAR
  • New York University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Universal goal hijacking is a kind of prompt injection attack that forces LLMs to return a target malicious response for arbitrary normal user prompts. Previous methods achieve high attack performance but are cumbersome and time-consuming. They also concentrate solely on optimization algorithms, overlooking the crucial role of the prompt. To this end, we propose a method called POUGH that incorporates an efficient optimization algorithm and two semantics-guided prompt organization strategies. Specifically, our method starts with a sampling strategy to select representative prompts from a candidate pool, followed by a ranking strategy that prioritizes them. Given the sequentially ranked prompts, our method employs an iterative optimization algorithm to generate a fixed suffix that can be concatenated to arbitrary user prompts for universal goal hijacking. Experiments conducted on four popular LLMs and ten types of target responses verify its effectiveness. Warning: This paper contains model outputs that are offensive in nature.

Original language: English
Title of host publication: Long Papers
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Publisher: Association for Computational Linguistics (ACL)
Pages: 5796-5816
Number of pages: 21
ISBN (electronic): 9798891762510
DOI
Publication status: Published - 2025
Event: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025 - Vienna, Austria
Duration: 27 Jul 2025 to 1 Aug 2025

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 1
ISSN (print): 0736-587X

Conference

Conference: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Country/Territory: Austria
City: Vienna
Period: 27/07/25 to 1/08/25
