TY - JOUR
T1 - Safe Reinforcement Learning for NN-controlled Systems with Neural Barrier Certificate Guidance
AU - Zhao, Hanrui
AU - Ren, Mengxin
AU - Liu, Banglong
AU - Qi, Niuniu
AU - Zeng, Xia
AU - Zeng, Zhenbing
AU - Yang, Zhengfeng
N1 - Publisher Copyright:
© 1982-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Safe controller synthesis is crucial for safety-critical applications. This paper presents a novel reinforcement learning approach to synthesize safe controllers for NN-controlled systems. The core idea leverages an iterative scheme that combines controller learning with neural barrier certificate (BC) verification, ultimately producing a provably safe deep neural network (DNN) controller with formal safety guarantees. The process begins by pre-training a well-performing DNN controller as an “oracle” via deep reinforcement learning (DRL). To formally verify the safety properties of the closed-loop system under the base controller, we devise a formal verification procedure that approximates the DNN controller using polynomial inclusion, followed by synthesizing neural BCs via sum-of-squares (SOS) relaxation. In cases where the base controller is insufficient to yield a real BC, the current spurious BC is incorporated as an additional penalty term to reshape the RL reward function, guiding the iterative refinement for new controllers. We implement an automated tool, NBCRL, and experimental results demonstrate the benefits of our method in terms of efficiency and scalability even for a nonlinear system with dimension up to 12.
AB - Safe controller synthesis is crucial for safety-critical applications. This paper presents a novel reinforcement learning approach to synthesize safe controllers for NN-controlled systems. The core idea leverages an iterative scheme that combines controller learning with neural barrier certificate (BC) verification, ultimately producing a provably safe deep neural network (DNN) controller with formal safety guarantees. The process begins by pre-training a well-performing DNN controller as an “oracle” via deep reinforcement learning (DRL). To formally verify the safety properties of the closed-loop system under the base controller, we devise a formal verification procedure that approximates the DNN controller using polynomial inclusion, followed by synthesizing neural BCs via sum-of-squares (SOS) relaxation. In cases where the base controller is insufficient to yield a real BC, the current spurious BC is incorporated as an additional penalty term to reshape the RL reward function, guiding the iterative refinement for new controllers. We implement an automated tool, NBCRL, and experimental results demonstrate the benefits of our method in terms of efficiency and scalability even for a nonlinear system with dimension up to 12.
KW - Continuous Dynamical Systems
KW - Counterexample Guidance
KW - Formal Verification
KW - Neural Barrier Certificate
KW - Safe Reinforcement Learning
UR - https://www.scopus.com/pages/publications/105017886367
U2 - 10.1109/TCAD.2025.3616856
DO - 10.1109/TCAD.2025.3616856
M3 - 文章
AN - SCOPUS:105017886367
SN - 0278-0070
JO - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
JF - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
ER -