Safe Controller Synthesis for Nonlinear Systems via Reinforcement Learning and PAC Approximation

  • Xia Zeng
  • Banglong Liu
  • Zhenbing Zeng
  • Zhiming Liu
  • Zhengfeng Yang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Controller synthesis for nonlinear systems is an important research problem. Deep Neural Network (DNN) control policies obtained through reinforcement learning (RL) perform well in simulation but cannot be deployed in safety-critical systems because they lack formal guarantees. To address this, the paper first exploits the strengths of RL on complex control tasks to obtain a well-performing DNN controller. Then, using PAC (Probably Approximately Correct) techniques, it derives a polynomial surrogate controller whose approximation error is probabilistically bounded. Finally, the safety of the control system under the resulting polynomial controller is verified by generating a barrier certificate. Experiments demonstrate that the method generates controllers with safety guarantees for high-dimensional, high-degree systems.
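
The middle step of the pipeline, replacing a DNN policy with a polynomial whose approximation error carries a probabilistic bound, can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify how the PAC fit is computed, so the sketch below assumes a least-squares fit followed by a standard sample-maximum argument (if the largest error over N fresh i.i.d. samples is e_max, then with confidence at least 1 - eta the error exceeds e_max on at most an eps fraction of the sampling distribution, provided (1 - eps)^N <= eta). The function `dnn_controller`, the state box [-1, 1]^2, the degree, and the values of `eps` and `eta` are all hypothetical choices made for illustration.

```python
import math
import numpy as np

def dnn_controller(x):
    """Hypothetical stand-in for a trained DNN policy u = pi(x), x in R^2."""
    return np.tanh(1.5 * x[:, 0] - 0.8 * x[:, 1]) + 0.1 * x[:, 0] * x[:, 1]

def monomials(x, degree):
    """All monomials x1^i * x2^j with i + j <= degree, one column each."""
    cols = [x[:, 0] ** i * x[:, 1] ** j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.stack(cols, axis=1)

def pac_sample_size(eps, eta):
    """Smallest N such that (1 - eps)^N <= eta."""
    return math.ceil(math.log(eta) / math.log(1.0 - eps))

rng = np.random.default_rng(0)
degree, eps, eta = 5, 0.01, 0.001  # illustrative values, not from the paper

# Step 1: fit the polynomial surrogate by least squares on training samples.
x_train = rng.uniform(-1.0, 1.0, size=(2000, 2))
coeffs, *_ = np.linalg.lstsq(
    monomials(x_train, degree), dnn_controller(x_train), rcond=None
)

# Step 2: measure the worst error on N fresh samples, independent of the fit.
n_val = pac_sample_size(eps, eta)
x_val = rng.uniform(-1.0, 1.0, size=(n_val, 2))
errors = np.abs(monomials(x_val, degree) @ coeffs - dnn_controller(x_val))
error_bound = float(errors.max())

print(f"N = {n_val} validation samples")
print(f"With confidence >= {1 - eta}, |poly - dnn| <= {error_bound:.4f} "
      f"on at least {1 - eps:.0%} of the state distribution")
```

The guarantee is relative to the sampling distribution (uniform on the box here) and says nothing by itself about safety; in the pipeline described above, the error bound would be treated as a bounded disturbance when the barrier certificate is generated for the polynomial closed loop.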

Original language: English
Title of host publication: Proceedings of the 61st ACM/IEEE Design Automation Conference, DAC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798400706011
State: Published - 7 Nov 2024
Event: 61st ACM/IEEE Design Automation Conference, DAC 2024 - San Francisco, United States
Duration: 23 Jun 2024 to 27 Jun 2024

Publication series

Name: Proceedings - Design Automation Conference
ISSN (Print): 0738-100X

Conference

Conference: 61st ACM/IEEE Design Automation Conference, DAC 2024
Country/Territory: United States
City: San Francisco
Period: 23/06/24 to 27/06/24

Keywords

  • Barrier certificate
  • Controller synthesis
  • Formal verification
  • Probably Approximately Correct
  • Reinforcement learning
