TaiChi: Improving the Robustness of NLP Models by Seeking Common Ground While Reserving Differences

Huimin Chen, Chengyu Wang, Yanhao Wang, Cen Chen, Yinggui Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Recent studies have shown that Pre-trained Language Models (PLMs) are vulnerable to adversarial examples, which are crafted by introducing human-imperceptible perturbations into clean examples to deceive the models. This vulnerability stems from the divergence between the data distributions of clean and adversarial examples. Addressing it therefore involves teaching the model to diminish the differences between the two types of samples and to focus more on their similarities. To this end, we propose a novel approach named TaiChi that employs a Siamese network architecture. Specifically, it consists of two sub-networks sharing the same structure but trained on clean and adversarial samples, respectively, and uses a contrastive learning strategy to encourage the generation of similar language representations for both kinds of samples. Furthermore, it utilizes the Kullback-Leibler (KL) divergence loss to enhance the consistency of the predictive behavior of the two sub-networks. Extensive experiments across three widely used datasets demonstrate that TaiChi achieves a better trade-off between robustness to token- and character-level adversarial attacks and accuracy on clean examples than previous defense methods. Our code and data are publicly available at https://github.com/sai4july/TaiChi.
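The abstract describes a training objective with three parts: a supervised task loss on clean and adversarial inputs, a contrastive loss pulling the two weight-sharing sub-networks' representations of the same example together, and a KL-divergence loss aligning their predictive distributions. Below is a minimal PyTorch-style sketch of such a combined objective; the function name, loss weights, temperature, and the symmetrized KL are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
# Sketch of a TaiChi-style combined objective, assuming a PyTorch setup.
# All names, loss weights, and the temperature below are illustrative
# assumptions, not taken from the paper or its code release.
import torch
import torch.nn.functional as F

def taichi_losses(clean_repr, adv_repr, clean_logits, adv_logits,
                  labels, temperature=0.1, lambda_cl=1.0, lambda_kl=1.0):
    """Combine task, contrastive, and KL-consistency losses.

    clean_repr / adv_repr: (batch, dim) representations from the two
        weight-sharing sub-networks (clean and adversarial views).
    clean_logits / adv_logits: (batch, num_classes) classifier outputs.
    """
    # Supervised task loss on both views.
    task_loss = (F.cross_entropy(clean_logits, labels)
                 + F.cross_entropy(adv_logits, labels))

    # Contrastive loss (InfoNCE-style): the adversarial view of example i
    # is the positive for its clean view; other in-batch examples serve
    # as negatives.
    z1 = F.normalize(clean_repr, dim=-1)
    z2 = F.normalize(adv_repr, dim=-1)
    sim = z1 @ z2.t() / temperature                  # (batch, batch)
    targets = torch.arange(z1.size(0), device=z1.device)
    cl_loss = F.cross_entropy(sim, targets)

    # KL-divergence loss pulling the two sub-networks' predictive
    # distributions together (symmetrized here as an assumption; the
    # paper may use a one-sided variant).
    log_p_clean = F.log_softmax(clean_logits, dim=-1)
    log_p_adv = F.log_softmax(adv_logits, dim=-1)
    kl_loss = 0.5 * (
        F.kl_div(log_p_adv, log_p_clean.exp(), reduction="batchmean")
        + F.kl_div(log_p_clean, log_p_adv.exp(), reduction="batchmean"))

    return task_loss + lambda_cl * cl_loss + lambda_kl * kl_loss
```

In training, each clean example and its adversarial counterpart would be encoded by the shared-structure sub-networks, and this combined loss back-propagated through both.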

Original language: English
Title of host publication: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Publisher: European Language Resources Association (ELRA)
Pages: 15542-15551
Number of pages: 10
ISBN (Electronic): 9782493814104
State: Published - 2024
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 - 25 May 2024

Publication series

Name: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

Conference

Conference: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Country/Territory: Italy
City: Hybrid, Torino
Period: 20/05/24 - 25/05/24

Keywords

  • model robustness
  • neural language representation learning
  • text analytics
  • text classification
