TY - GEN
T1 - LST2A
T2 - 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024
AU - Zhou, Guanghao
AU - Qiu, Panjia
AU - Fan, Mingyuan
AU - Chen, Cen
AU - Li, Yaliang
AU - Zhou, Wenmeng
N1 - Publisher Copyright:
© 2024 ACM.
PY - 2024/10/21
Y1 - 2024/10/21
N2 - Textual adversarial attack in black-box scenarios is a challenging task, as only the predicted label is available, and the text space is discrete and non-differentiable. Current research in this area is still in its infancy and mostly focuses on untargeted attack, lacking the capability to control the labels of the generated adversarial examples. Meanwhile, existing textual adversarial attack methods primarily rely on word substitution operations to maintain semantic similarity between the adversarial and original examples, which greatly limits the search space for adversarial examples. To address these issues, we propose a novel Lexical-Syntactic Targeted Adversarial Attack method tailored for the black-box settings, referred to as LST2A. Our approach involves adversarial perturbations at different levels of granularities, i.e., word-level with word substitution operations and syntactic-level through rewriting the syntax of the examples. Specifically, we first embed the entire text into the embedding layer of a masked language model, and then optimize perturbations at the word level within the hidden state to generate adversarial examples with the target label. For examples that are difficult to attack successfully with only word-level perturbations at higher semantic similarity thresholds, we leverage Large Language Model (LLM) to introduce syntactic-level perturbations to these examples, making them more vulnerable to the decision boundary of the victim model. Subsequently, we re-optimize the word-level perturbations for these vulnerable examples. Extensive experiments and human evaluations demonstrate that our proposed method consistently outperforms the state-of-the-art baselines, crafting smoother, more grammatically correct adversarial examples.
AB - Textual adversarial attack in black-box scenarios is a challenging task, as only the predicted label is available, and the text space is discrete and non-differentiable. Current research in this area is still in its infancy and mostly focuses on untargeted attack, lacking the capability to control the labels of the generated adversarial examples. Meanwhile, existing textual adversarial attack methods primarily rely on word substitution operations to maintain semantic similarity between the adversarial and original examples, which greatly limits the search space for adversarial examples. To address these issues, we propose a novel Lexical-Syntactic Targeted Adversarial Attack method tailored for the black-box settings, referred to as LST2A. Our approach involves adversarial perturbations at different levels of granularities, i.e., word-level with word substitution operations and syntactic-level through rewriting the syntax of the examples. Specifically, we first embed the entire text into the embedding layer of a masked language model, and then optimize perturbations at the word level within the hidden state to generate adversarial examples with the target label. For examples that are difficult to attack successfully with only word-level perturbations at higher semantic similarity thresholds, we leverage Large Language Model (LLM) to introduce syntactic-level perturbations to these examples, making them more vulnerable to the decision boundary of the victim model. Subsequently, we re-optimize the word-level perturbations for these vulnerable examples. Extensive experiments and human evaluations demonstrate that our proposed method consistently outperforms the state-of-the-art baselines, crafting smoother, more grammatically correct adversarial examples.
KW - adversarial examples
KW - black-box scenario
KW - deep neural networks
KW - gradient-based optimization
KW - security
KW - targeted attack
KW - textual adversarial attack
UR - https://www.scopus.com/pages/publications/85210001370
U2 - 10.1145/3627673.3679640
DO - 10.1145/3627673.3679640
M3 - 会议稿件
AN - SCOPUS:85210001370
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 3463
EP - 3473
BT - CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 21 October 2024 through 25 October 2024
ER -