
LST2A: Lexical-Syntactic Targeted Adversarial Attack for Texts

  • Guanghao Zhou
  • Panjia Qiu
  • Mingyuan Fan
  • Cen Chen*
  • Yaliang Li
  • Wenmeng Zhou
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Textual adversarial attack in black-box scenarios is a challenging task, as only the predicted label is available, and the text space is discrete and non-differentiable. Current research in this area is still in its infancy and mostly focuses on untargeted attacks, lacking the capability to control the labels of the generated adversarial examples. Meanwhile, existing textual adversarial attack methods primarily rely on word substitution operations to maintain semantic similarity between the adversarial and original examples, which greatly limits the search space for adversarial examples. To address these issues, we propose a novel <u>L</u>exical-<u>S</u>yntactic <u>T</u>argeted <u>A</u>dversarial <u>A</u>ttack method tailored for black-box settings, referred to as LST2A. Our approach involves adversarial perturbations at different levels of granularity, i.e., the word level via substitution operations and the syntactic level via rewriting the syntax of the examples. Specifically, we first embed the entire text into the embedding layer of a masked language model, and then optimize perturbations at the word level within the hidden state to generate adversarial examples with the target label. For examples that are difficult to attack successfully with only word-level perturbations at higher semantic similarity thresholds, we leverage a Large Language Model (LLM) to introduce syntactic-level perturbations to these examples, making them more vulnerable near the decision boundary of the victim model. Subsequently, we re-optimize the word-level perturbations for these vulnerable examples. Extensive experiments and human evaluations demonstrate that our proposed method consistently outperforms the state-of-the-art baselines, crafting smoother, more grammatically correct adversarial examples.
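The hard-label targeted search described in the abstract can be illustrated with a minimal pure-Python sketch. Everything below is a toy stand-in, not the paper's implementation: the victim classifier, the cue-word lists, and the `CANDIDATES` table are hypothetical; in LST2A, substitution candidates come from optimizing perturbations in a masked language model's hidden state, and a failed search falls back to an LLM-driven syntactic rewrite.

```python
# Toy black-box victim classifier: only the predicted label is observable.
# Sentiment is decided by counting hypothetical positive/negative cue words.
POSITIVE = {"good", "great", "fine", "solid"}
NEGATIVE = {"bad", "awful", "poor", "dull"}

def victim_predict(words):
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative"

# Hypothetical substitution candidates; in LST2A these would be derived from
# word-level perturbations optimized in a masked language model's hidden state.
CANDIDATES = {"bad": ["fine"], "dull": ["solid"]}

def targeted_word_attack(words, target_label, max_changes=3):
    """Greedy hard-label search: substitute words until the victim model
    outputs the attacker-chosen target label."""
    words = list(words)
    changes = 0
    for i, w in enumerate(words):
        if victim_predict(words) == target_label:
            break  # target label already reached
        for cand in CANDIDATES.get(w, []):
            # Hard-label setting: no confidence scores are available, so
            # tentatively keep the substitution and keep scanning.
            words = words[:i] + [cand] + words[i + 1:]
            changes += 1
            break
        if changes >= max_changes:
            break
    # A failed search is where LST2A would apply syntactic-level rewriting.
    return words if victim_predict(words) == target_label else None

adv = targeted_word_attack(["the", "movie", "was", "bad", "and", "dull"], "positive")
```

The sketch only captures the control flow of a targeted hard-label attack; it omits the semantic-similarity constraint and the re-optimization step that the full method performs after syntactic rewriting.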

Original language: English
Title of host publication: CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 3463-3473
Number of pages: 11
ISBN (Electronic): 9798400704369
DOI
Publication status: Published - 21 Oct 2024
Event: 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024 - Boise, United States
Duration: 21 Oct 2024 → 25 Oct 2024

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings
ISSN (Print): 2155-0751

Conference

Conference: 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024
Country/Territory: United States
City: Boise
Period: 21/10/24 → 25/10/24
