TY - GEN
T1 - Class-Balanced Universal Perturbations for Adversarial Training
AU - Ma, Kexue
AU - Cao, Guitao
AU - Xu, Mengqian
AU - Wu, Chunwei
AU - Wang, Hong
AU - Cao, Wenming
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - A universal attack generates an image-agnostic perturbation, called a universal adversarial perturbation (UAP), which can be added to all samples in the data distribution to fool the classifier. However, a universal perturbation is likely to mislead the classifier into assigning the same label to most adversarial examples, resulting in an imbalance of attack strength between classes. In this paper, we propose class-balanced UAPs that enlarge the dispersion of the predicted labels for adversarial examples. To ensure attack strength and balance simultaneously, we design a novel diversity objective containing a probability calibration term and a penalty regularizer, which fully considers the predicted label distribution between samples and the predicted probability distribution within samples. Furthermore, we apply class-balanced attacks in adversarial training to defend against universal perturbations, since the class-balanced UAP provides diverse perturbation directions. Accordingly, we reformulate adversarial training from a min-max optimization problem into a new two-stage framework. Experiments on several benchmark datasets demonstrate that the class-balanced attack achieves better performance than the universal attack, while adversarial training with class-balanced UAPs achieves state-of-the-art results in clean accuracy and robustness to universal perturbations.
AB - A universal attack generates an image-agnostic perturbation, called a universal adversarial perturbation (UAP), which can be added to all samples in the data distribution to fool the classifier. However, a universal perturbation is likely to mislead the classifier into assigning the same label to most adversarial examples, resulting in an imbalance of attack strength between classes. In this paper, we propose class-balanced UAPs that enlarge the dispersion of the predicted labels for adversarial examples. To ensure attack strength and balance simultaneously, we design a novel diversity objective containing a probability calibration term and a penalty regularizer, which fully considers the predicted label distribution between samples and the predicted probability distribution within samples. Furthermore, we apply class-balanced attacks in adversarial training to defend against universal perturbations, since the class-balanced UAP provides diverse perturbation directions. Accordingly, we reformulate adversarial training from a min-max optimization problem into a new two-stage framework. Experiments on several benchmark datasets demonstrate that the class-balanced attack achieves better performance than the universal attack, while adversarial training with class-balanced UAPs achieves state-of-the-art results in clean accuracy and robustness to universal perturbations.
UR - https://www.scopus.com/pages/publications/85169583413
U2 - 10.1109/IJCNN54540.2023.10191447
DO - 10.1109/IJCNN54540.2023.10191447
M3 - Conference contribution
AN - SCOPUS:85169583413
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - IJCNN 2023 - International Joint Conference on Neural Networks, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 International Joint Conference on Neural Networks, IJCNN 2023
Y2 - 18 June 2023 through 23 June 2023
ER -