TY - JOUR
T1 - IsBad
T2 - Imperceptible Sample-Specific Backdoor to DNN with Denoising Autoencoder
AU - Wang, Xiangqi
AU - Xue, Mingfu
AU - Chen, Kewei
AU - Xu, Jing
AU - Liu, Wenmao
AU - Zhang, Leo Yu
AU - Zhang, Yushu
N1 - Publisher Copyright:
© 2025 IEEE. All rights reserved.
PY - 2025
Y1 - 2025
N2 - The backdoor attack poses a new security threat to deep neural networks (DNNs). Existing backdoors often rely on visible universal triggers to make the backdoored model malfunction, which are not only visually suspicious to humans but also detectable by mainstream countermeasures. We propose an imperceptible sample-specific backdoor in which the trigger varies from sample to sample and is invisible. Our trigger generation is automated through a denoising autoencoder that is fed with delicate but pervasive features (e.g., edge patterns per image). We extensively experiment with our backdoor attack on ImageNet and MS-Celeb-1M, demonstrating a stable and nearly 100% (i.e., 99.8%) attack success rate with negligible impact on the clean data accuracy of the infected model. The denoising autoencoder-based trigger generator is reusable and transferable across tasks (e.g., from ImageNet to MS-Celeb-1M), whilst the trigger has high exclusiveness (i.e., a trigger generated for one sample is not applicable to another sample). Moreover, the proposed backdoored model achieves high evasiveness against mainstream backdoor defenses such as Neural Cleanse, STRIP, SentiNet, Fine-Pruning, BDMAE, and BTI-DBF.
AB - The backdoor attack poses a new security threat to deep neural networks (DNNs). Existing backdoors often rely on visible universal triggers to make the backdoored model malfunction, which are not only visually suspicious to humans but also detectable by mainstream countermeasures. We propose an imperceptible sample-specific backdoor in which the trigger varies from sample to sample and is invisible. Our trigger generation is automated through a denoising autoencoder that is fed with delicate but pervasive features (e.g., edge patterns per image). We extensively experiment with our backdoor attack on ImageNet and MS-Celeb-1M, demonstrating a stable and nearly 100% (i.e., 99.8%) attack success rate with negligible impact on the clean data accuracy of the infected model. The denoising autoencoder-based trigger generator is reusable and transferable across tasks (e.g., from ImageNet to MS-Celeb-1M), whilst the trigger has high exclusiveness (i.e., a trigger generated for one sample is not applicable to another sample). Moreover, the proposed backdoored model achieves high evasiveness against mainstream backdoor defenses such as Neural Cleanse, STRIP, SentiNet, Fine-Pruning, BDMAE, and BTI-DBF.
KW - artificial intelligence security
KW - backdoor attack
KW - data poisoning
KW - imperceptible trigger
UR - https://www.scopus.com/pages/publications/105022306181
U2 - 10.1109/TETC.2025.3631134
DO - 10.1109/TETC.2025.3631134
M3 - Article
AN - SCOPUS:105022306181
SN - 2168-6750
JO - IEEE Transactions on Emerging Topics in Computing
JF - IEEE Transactions on Emerging Topics in Computing
ER -