LST2A: Lexical-Syntactic Targeted Adversarial Attack for Texts

Guanghao Zhou, Panjia Qiu, Mingyuan Fan, Cen Chen, Yaliang Li, Wenmeng Zhou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Textual adversarial attack in black-box scenarios is a challenging task, as only the predicted label is available and the text space is discrete and non-differentiable. Current research in this area is still in its infancy and mostly focuses on untargeted attacks, lacking the capability to control the labels of the generated adversarial examples. Meanwhile, existing textual adversarial attack methods primarily rely on word substitution operations to maintain semantic similarity between the adversarial and original examples, which greatly limits the search space for adversarial examples. To address these issues, we propose a novel Lexical-Syntactic Targeted Adversarial Attack method tailored for black-box settings, referred to as LST2A. Our approach applies adversarial perturbations at different levels of granularity, i.e., at the word level via word substitution operations and at the syntactic level by rewriting the syntax of the examples. Specifically, we first embed the entire text into the embedding layer of a masked language model, and then optimize perturbations at the word level within the hidden state to generate adversarial examples with the target label. For examples that are difficult to attack successfully with only word-level perturbations at higher semantic similarity thresholds, we leverage a Large Language Model (LLM) to introduce syntactic-level perturbations, pushing these examples toward the decision boundary of the victim model and making them more vulnerable. Subsequently, we re-optimize the word-level perturbations for these vulnerable examples. Extensive experiments and human evaluations demonstrate that our proposed method consistently outperforms the state-of-the-art baselines, crafting smoother, more grammatically correct adversarial examples.
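The abstract outlines a gradient-style optimization over word embeddings followed by projection back to discrete tokens. Below is a minimal, hypothetical Python sketch of that general idea only: it assumes a white-box surrogate classifier (the paper itself operates in a hard-label black-box setting), uses an illustrative HuggingFace model name, replaces the paper's semantic-similarity constraint with a simple L2 penalty, and omits the syntactic-level LLM rewriting stage entirely. It is not the authors' actual procedure.

```python
# Hypothetical sketch: targeted word-level attack via embedding-space
# perturbation, then nearest-neighbor projection back to vocabulary tokens.
# Model name, hyperparameters, and the white-box surrogate are assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
surrogate = AutoModelForSequenceClassification.from_pretrained(
    "textattack/bert-base-uncased-SST-2"  # assumed surrogate classifier
)
surrogate.eval()

embedding_matrix = surrogate.get_input_embeddings().weight  # (V, d)

def targeted_word_level_attack(text, target_label, steps=50, lr=0.01, lam=0.1):
    enc = tokenizer(text, return_tensors="pt")
    input_embeds = surrogate.get_input_embeddings()(enc["input_ids"]).detach()

    # Learnable additive perturbation on the token embeddings.
    delta = torch.zeros_like(input_embeds, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        logits = surrogate(
            inputs_embeds=input_embeds + delta,
            attention_mask=enc["attention_mask"],
        ).logits
        # Targeted loss: pull predictions toward the chosen label, with an
        # L2 penalty as a crude stand-in for a semantic-similarity constraint.
        loss = F.cross_entropy(logits, torch.tensor([target_label]))
        loss = loss + lam * delta.norm()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Project each perturbed embedding onto its nearest vocabulary token to
    # recover a discrete adversarial sentence (i.e., word substitutions).
    perturbed = (input_embeds + delta).detach().squeeze(0)  # (T, d)
    dists = torch.cdist(perturbed, embedding_matrix)        # (T, V)
    adv_ids = dists.argmin(dim=-1)
    return tokenizer.decode(adv_ids, skip_special_tokens=True)

print(targeted_word_level_attack("the movie was wonderful", target_label=0))
```

The nearest-neighbor projection yields at most one substitute word per position; a practical attack would constrain the candidate substitutions far more carefully to preserve fluency and semantics, as the paper's evaluation emphasizes.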

Original language: English
Title of host publication: CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 3463-3473
Number of pages: 11
ISBN (Electronic): 9798400704369
DOIs
State: Published - 21 Oct 2024
Event: 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024 - Boise, United States
Duration: 21 Oct 2024 - 25 Oct 2024

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings
ISSN (Print): 2155-0751

Conference

Conference: 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024
Country/Territory: United States
City: Boise
Period: 21/10/24 - 25/10/24

Keywords

  • adversarial examples
  • black-box scenario
  • deep neural networks
  • gradient-based optimization
  • security
  • targeted attack
  • textual adversarial attack
