TY - JOUR
T1 - An Active Authorization Control Method for Deep Reinforcement Learning Model Based on GANs and Adaptive Trigger
AU - Xue, Mingfu
AU - Chen, Kewei
AU - Zhang, Leo Yu
AU - Zhang, Yushu
AU - Liu, Weiqiang
N1 - Publisher Copyright:
© 2005-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - In recent years, deep reinforcement learning (DRL) has found widespread applications across diverse scenarios. Since the DRL training process requires substantial time and financial cost, a well-trained DRL policy should be considered intellectual property (IP) that deserves proper protection. However, to date, there are only a few studies on IP protection for DRL, and the existing methods are limited to passive copyright verification. In this paper, we propose the first active authorization control method for DRL, which can proactively protect a deep reinforcement learning policy. The DRL policy trained with this method can be used normally by authorized users but cannot be used by unauthorized users (i.e., the protected policy’s performance for unauthorized users is paralyzed). Specifically, we train a trigger injection network and a discriminator network based on generative adversarial networks (GANs). During the DRL policy training phase, we use the trigger injection network to insert sample-specific triggers into all observations and use the triggered observations to train the protected policy. Our approach is applicable across various deep reinforcement learning algorithms. We conduct effectiveness experiments on different DRL policies trained using different DRL algorithms. The experimental results reveal that the performance of authorized users is on par with that of a clean DRL policy trained normally (baseline), whereas the performance of unauthorized users deviates significantly from the baseline. Specifically, the authorized performance of the protected Breakout-DQN, Breakout-A2C, MsPacman-DQN, and MsPacman-A2C policies is 416.4 (baseline 397.8), 403.0 (baseline 415.0), 2552.0 (baseline 2472.0), and 1964.0 (baseline 1828.0), respectively. Comparatively, the unauthorized performance of the protected Breakout-DQN, Breakout-A2C, MsPacman-DQN, and MsPacman-A2C policies is only 4.4 (baseline 397.8), 2.0 (baseline 415.0), 74.0 (baseline 2472.0), and 514.0 (baseline 1828.0), respectively. Furthermore, the experiments demonstrate that the proposed method exhibits robustness against pruning, fine-tuning, and adaptive attacks.
KW - Trustworthy artificial intelligence
KW - active authorization control
KW - backdoor attack
KW - deep reinforcement learning
KW - intellectual property protection
UR - https://www.scopus.com/pages/publications/105004893070
U2 - 10.1109/TIFS.2025.3567915
DO - 10.1109/TIFS.2025.3567915
M3 - Article
AN - SCOPUS:105004893070
SN - 1556-6013
VL - 20
SP - 5789
EP - 5801
JO - IEEE Transactions on Information Forensics and Security
JF - IEEE Transactions on Information Forensics and Security
ER -