TY - GEN
T1 - Safe Reinforcement Learning via Probabilistic Timed Computation Tree Logic
AU - Qian, Li
AU - Liu, Jing
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
AB - Reinforcement learning aims to discover an optimal policy that maximizes reward based on a feedback signal. Although the method succeeds in numerous systems, it may not apply to safety-critical systems because it lacks a safety protection mechanism. Moreover, the agent cannot model the environment accurately if it receives biased observations. We present a safe algorithm called Safe Control with Supervisor (SCS) to address these limitations. If the model is accurate, the supervisor monitors the system and repairs the agent's actions at runtime, guiding the system to obey a specification expressed in probabilistic timed Computation Tree Logic (ptCTL). Otherwise, the supervisor maximizes the probability of satisfying a given task specification. We validate our method through experiments on adaptive cruise control under uncertainty.
KW - Probabilistic timed computation tree logic
KW - Reinforcement learning
KW - Safe control
UR - https://www.scopus.com/pages/publications/85093862133
U2 - 10.1109/IJCNN48605.2020.9207384
DO - 10.1109/IJCNN48605.2020.9207384
M3 - Conference contribution
AN - SCOPUS:85093862133
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 International Joint Conference on Neural Networks, IJCNN 2020
Y2 - 19 July 2020 through 24 July 2020
ER -