TY - JOUR
T1 - An Explainable Intellectual Property Protection Method for Deep Neural Networks Based on Intrinsic Features
AU - Xue, Mingfu
AU - Wang, Xin
AU - Wu, Yinghao
AU - Ni, Shifeng
AU - Zhang, Leo Yu
AU - Zhang, Yushu
AU - Liu, Weiqiang
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2024
Y1 - 2024
N2 - Intellectual property (IP) protection for deep neural networks (DNNs) has raised serious concerns in recent years. Most existing works embed watermarks in the DNN model for IP protection, which requires modifying the model and does not consider interpretability. In this article, for the first time, we propose an interpretable IP protection method for DNNs based on explainable artificial intelligence. Compared with existing works, the proposed method does not modify the DNN model, and the decision of the ownership verification is interpretable. We extract the intrinsic features of the DNN model by using deep Taylor decomposition. Since the intrinsic feature is composed of the unique interpretations of the model's decisions, it can be regarded as the fingerprint of the model. If the fingerprint of a suspected model is the same as that of the original model, the suspected model is considered a pirated model. Experimental results demonstrate that the fingerprints can be successfully used to verify the ownership of the model, and the test accuracy of the model is not affected. Furthermore, the proposed method is robust to fine-tuning attacks, pruning attacks, watermark-overwriting attacks, and adaptive attacks.
AB - Intellectual property (IP) protection for deep neural networks (DNNs) has raised serious concerns in recent years. Most existing works embed watermarks in the DNN model for IP protection, which requires modifying the model and does not consider interpretability. In this article, for the first time, we propose an interpretable IP protection method for DNNs based on explainable artificial intelligence. Compared with existing works, the proposed method does not modify the DNN model, and the decision of the ownership verification is interpretable. We extract the intrinsic features of the DNN model by using deep Taylor decomposition. Since the intrinsic feature is composed of the unique interpretations of the model's decisions, it can be regarded as the fingerprint of the model. If the fingerprint of a suspected model is the same as that of the original model, the suspected model is considered a pirated model. Experimental results demonstrate that the fingerprints can be successfully used to verify the ownership of the model, and the test accuracy of the model is not affected. Furthermore, the proposed method is robust to fine-tuning attacks, pruning attacks, watermark-overwriting attacks, and adaptive attacks.
KW - Deep neural network (DNN)
KW - deep Taylor decomposition (DTD)
KW - fingerprint
KW - intellectual property
KW - intrinsic feature
UR - https://www.scopus.com/pages/publications/85190723495
U2 - 10.1109/TAI.2024.3388389
DO - 10.1109/TAI.2024.3388389
M3 - Article
AN - SCOPUS:85190723495
SN - 2691-4581
VL - 5
SP - 4649
EP - 4659
JO - IEEE Transactions on Artificial Intelligence
JF - IEEE Transactions on Artificial Intelligence
IS - 9
ER -