An Iterative Scheme of Safe Reinforcement Learning for Nonlinear Systems via Barrier Certificate Generation

  • Zhengfeng Yang
  • Yidan Zhang
  • Wang Lin*
  • Xia Zeng
  • Xiaochao Tang
  • Zhenbing Zeng
  • Zhiming Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

10 Scopus citations

Abstract

In this paper, we propose a safe reinforcement learning approach to synthesize deep neural network (DNN) controllers for nonlinear systems subject to safety constraints. The approach employs an iterative scheme in which a learner and a verifier interact to synthesize safe DNN controllers. The learner trains a DNN controller via deep reinforcement learning, and the verifier certifies the learned controller by computing a maximal safe initial region and its corresponding barrier certificate, using polynomial abstraction and bilinear matrix inequality (BMI) solving. In contrast to existing verification-in-the-loop synthesis methods, our iterative framework synthesizes controllers and barrier certificates sequentially, so it can learn safe controllers with adaptive barrier certificates rather than user-defined ones. We implement the tool SRLBC and evaluate its performance on a set of benchmark examples. The experimental results demonstrate that our approach efficiently synthesizes safe DNN controllers, even for nonlinear systems of dimension up to 12.
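The learner/verifier interaction described in the abstract can be pictured with a deliberately tiny toy. The sketch below is not the SRLBC implementation and uses none of its code: the one-dimensional plant dx/dt = x + u, the linear controller u = -gain * x, the quadratic candidate B(x) = r^2 - x^2, the unsafe threshold, and all function names are assumptions chosen only for illustration. A fixed gain increment stands in for deep reinforcement learning, and a closed-form check on a coarse grid of radii stands in for polynomial abstraction and BMI solving.

```python
from typing import Optional, Tuple

UNSAFE_RADIUS = 2.0  # hypothetical: states with |x| >= 2.0 are unsafe


def train_controller(gain: float) -> float:
    """Stand-in for the learner: strengthen the feedback gain by a fixed step.

    The paper trains a DNN with deep RL; this toy only adjusts one scalar.
    """
    return gain + 0.5


def verify(gain: float) -> Optional[float]:
    """Stand-in for the verifier.

    Closed loop: dx/dt = x + u with u = -gain * x, i.e. dx/dt = (1 - gain) * x.
    Candidate barrier: B(x) = r**2 - x**2, with initial region |x| <= r.
    Returns the largest r on a coarse grid for which B separates the initial
    region from the unsafe set and the flow points inward on B(x) = 0,
    or None if no such r exists.
    """
    best = None
    for i in range(1, 20):
        r = 0.1 * i  # candidate radii 0.1, 0.2, ..., 1.9
        separates_unsafe = r < UNSAFE_RADIUS  # B(x) < 0 whenever |x| >= 2.0
        # dB/dt at x = +r or x = -r: -2x * (1 - gain) * x = 2 * (gain - 1) * r^2
        lie_derivative = 2.0 * (gain - 1.0) * r * r
        if separates_unsafe and lie_derivative > 0.0:
            best = r
    return best


def synthesize(max_iters: int = 10) -> Tuple[float, float]:
    """Alternate learner and verifier until a controller is certified."""
    gain = 0.0
    for _ in range(max_iters):
        gain = train_controller(gain)
        radius = verify(gain)
        if radius is not None:
            return gain, radius  # certified gain and safe initial radius
    raise RuntimeError("no certified controller found within max_iters")


if __name__ == "__main__":
    print(synthesize())
```

In this toy the verifier rejects the first two gains, because the closed loop is not contracting at the boundary of any candidate region, and certifies the third. The point is only the control flow: the certificate is computed for each learned controller rather than fixed in advance by the user.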

Original language: English
Title of host publication: Computer Aided Verification - 33rd International Conference, CAV 2021, Proceedings
Editors: Alexandra Silva, K. Rustan Leino
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 467-490
Number of pages: 24
ISBN (Print): 9783030816841
DOIs
State: Published - 2021
Event: 33rd International Conference on Computer Aided Verification, CAV 2021 - Virtual, Online
Duration: 20 Jul 2021 to 23 Jul 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12759 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 33rd International Conference on Computer Aided Verification, CAV 2021
City: Virtual, Online
Period: 20/07/21 to 23/07/21

Keywords

  • Barrier certificates
  • Continuous dynamical systems
  • Formal verification
  • Safe reinforcement learning
