IsBad: Imperceptible Sample-Specific Backdoor to DNN with Denoising Autoencoder

Xiangqi Wang, Mingfu Xue*, Kewei Chen, Jing Xu, Wenmao Liu, Leo Yu Zhang, Yushu Zhang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Backdoor attacks pose a new security threat to deep neural networks (DNNs). Existing backdoors often rely on visible universal triggers to make the backdoored model malfunction; such triggers are not only visually suspicious to humans but also detectable by mainstream countermeasures. We propose an imperceptible sample-specific backdoor in which the trigger varies from sample to sample and is invisible. Trigger generation is automated through a denoising autoencoder fed with delicate but pervasive features (e.g., per-image edge patterns). We extensively evaluate our backdoor attack on ImageNet and MS-Celeb-1M, demonstrating a stable, nearly 100% (i.e., 99.8%) attack success rate with negligible impact on the clean-data accuracy of the infected model. The denoising autoencoder-based trigger generator is reusable and transferable across tasks (e.g., from ImageNet to MS-Celeb-1M), while the trigger has high exclusiveness (i.e., a trigger generated for one sample is not applicable to another). In addition, the model backdoored by our proposal achieves high evasiveness against mainstream backdoor defenses such as Neural Cleanse, STRIP, SentiNet, Fine-Pruning, BDMAE, and BTI-DBF.
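The sketch below illustrates the general idea described in the abstract: a denoising autoencoder takes a per-image edge map and emits a low-amplitude, sample-specific trigger that is blended into that image. This is not the authors' implementation; the network sizes, the Sobel-based edge extraction, the blend factor `alpha`, and all function names are illustrative assumptions.

```python
# Minimal sketch (assumed details, not the paper's code): sample-specific
# trigger generation with a denoising autoencoder fed per-image edge maps.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenoisingAutoencoder(nn.Module):
    """Maps a single-channel edge map to a 3-channel trigger image."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, edge_map):
        return self.decoder(self.encoder(edge_map))

def edge_map(image):
    """Per-image edge features via a Sobel filter (an illustrative choice)."""
    gray = image.mean(dim=1, keepdim=True)
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def poison(image, generator, alpha=0.02):
    """Blend the sample-specific trigger at low amplitude to stay imperceptible."""
    trigger = generator(edge_map(image))  # trigger depends only on this image
    return torch.clamp(image + alpha * trigger, 0.0, 1.0)

# Usage sketch: poisoned samples would be relabeled to the attacker's target
# class during training; at inference the same generator reproduces the trigger.
gen = DenoisingAutoencoder()
x = torch.rand(1, 3, 224, 224)        # a clean ImageNet-sized input
x_poisoned = poison(x, gen)
```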

Original language: English
Journal: IEEE Transactions on Emerging Topics in Computing
DOIs
State: Accepted/In press - 2025

Keywords

  • artificial intelligence security
  • backdoor attack
  • data poisoning
  • imperceptible trigger
