TY - JOUR
T1 - Interpretable policy derivation for reinforcement learning based on evolutionary feature synthesis
AU - Zhang, Hengzhe
AU - Zhou, Aimin
AU - Lin, Xin
N1 - Publisher Copyright:
© 2020, The Author(s).
PY - 2020/10
Y1 - 2020/10
N2 - Reinforcement learning based on deep neural networks has attracted much attention and has been widely used in real-world applications. However, the black-box property limits its application in high-stakes areas, such as manufacturing and healthcare. To deal with this problem, some researchers resort to interpretable control policy generation algorithms. The basic idea is to use an interpretable model, such as tree-based genetic programming, to extract a policy from a black-box model, such as a neural network. Following this idea, in this paper, we try yet another form of genetic programming, evolutionary feature synthesis, to extract a control policy from the neural network. We also propose an evolutionary method to automatically optimize the operator set of the control policy for each specific problem. Moreover, a policy simplification strategy is also introduced. We conduct experiments on four reinforcement learning environments. The experimental results reveal that evolutionary feature synthesis can achieve better performance than tree-based genetic programming in extracting a policy from the neural network, with comparable interpretability.
AB - Reinforcement learning based on deep neural networks has attracted much attention and has been widely used in real-world applications. However, the black-box property limits its application in high-stakes areas, such as manufacturing and healthcare. To deal with this problem, some researchers resort to interpretable control policy generation algorithms. The basic idea is to use an interpretable model, such as tree-based genetic programming, to extract a policy from a black-box model, such as a neural network. Following this idea, in this paper, we try yet another form of genetic programming, evolutionary feature synthesis, to extract a control policy from the neural network. We also propose an evolutionary method to automatically optimize the operator set of the control policy for each specific problem. Moreover, a policy simplification strategy is also introduced. We conduct experiments on four reinforcement learning environments. The experimental results reveal that evolutionary feature synthesis can achieve better performance than tree-based genetic programming in extracting a policy from the neural network, with comparable interpretability.
KW - Explainable machine learning
KW - Genetic programming
KW - Policy derivation
KW - Reinforcement learning
UR - https://www.scopus.com/pages/publications/85134057232
U2 - 10.1007/s40747-020-00175-y
DO - 10.1007/s40747-020-00175-y
M3 - Article
AN - SCOPUS:85134057232
SN - 2199-4536
VL - 6
SP - 741
EP - 753
JO - Complex and Intelligent Systems
JF - Complex and Intelligent Systems
IS - 3
ER -