TY - JOUR
T1 - Semi-Fragile Neural Network Watermarking Based on Adversarial Examples
AU - Yuan, Zihan
AU - Zhang, Xinpeng
AU - Wang, Zichi
AU - Yin, Zhaoxia
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2024
Y1 - 2024
N2 - Deep neural networks (DNNs) may be subject to various modifications during transmission and use. Regular processing operations do not affect the functionality of a model, while malicious tampering will cause serious damage. Therefore, it is crucial to determine the availability of a DNN model. To address this issue, we propose a semi-fragile black-box watermarking method that can distinguish between accidental modification and malicious tampering of DNNs, focusing on the privacy and security of neural network models. Specifically, for a given model, a strategy is designed to generate semi-fragile and sensitive samples using adversarial example techniques without decreasing the model accuracy. The model outputs for these samples are extremely sensitive to malicious tampering and robust to accidental modification. Based on these properties, accidental modification and malicious tampering can be distinguished to assess the availability of a watermarked model. Extensive experiments demonstrate that the proposed method can detect malicious model tampering with accuracy up to 100% while tolerating accidental modifications such as fine-tuning, pruning, and quantization with accuracy exceeding 75%. Moreover, our semi-fragile neural network watermarking approach can be easily extended to various DNNs.
AB - Deep neural networks (DNNs) may be subject to various modifications during transmission and use. Regular processing operations do not affect the functionality of a model, while malicious tampering will cause serious damage. Therefore, it is crucial to determine the availability of a DNN model. To address this issue, we propose a semi-fragile black-box watermarking method that can distinguish between accidental modification and malicious tampering of DNNs, focusing on the privacy and security of neural network models. Specifically, for a given model, a strategy is designed to generate semi-fragile and sensitive samples using adversarial example techniques without decreasing the model accuracy. The model outputs for these samples are extremely sensitive to malicious tampering and robust to accidental modification. Based on these properties, accidental modification and malicious tampering can be distinguished to assess the availability of a watermarked model. Extensive experiments demonstrate that the proposed method can detect malicious model tampering with accuracy up to 100% while tolerating accidental modifications such as fine-tuning, pruning, and quantization with accuracy exceeding 75%. Moreover, our semi-fragile neural network watermarking approach can be easily extended to various DNNs.
KW - Semi-fragile watermarking
KW - black-box
KW - malicious tampering
KW - neural network
KW - privacy and security
UR - https://www.scopus.com/pages/publications/85188464945
U2 - 10.1109/TETCI.2024.3372373
DO - 10.1109/TETCI.2024.3372373
M3 - Article
AN - SCOPUS:85188464945
SN - 2471-285X
VL - 8
SP - 2775
EP - 2790
JO - IEEE Transactions on Emerging Topics in Computational Intelligence
JF - IEEE Transactions on Emerging Topics in Computational Intelligence
IS - 4
ER -