TY - GEN
T1 - Automatic discovery and transfer of MAXQ hierarchies in a complex system
AU - Wang, Hongbing
AU - Li, Wenya
AU - Zhou, Xuan
PY - 2012
Y1 - 2012
N2 - Reinforcement learning is an important category of machine learning approaches that exhibits self-learning and online learning characteristics. Using reinforcement learning, an agent can learn its behaviors through trial-and-error interactions with a dynamic environment and eventually arrive at an optimal strategy. Reinforcement learning suffers from the curse of dimensionality, though there has been significant progress in overcoming this issue in recent years. MAXQ is one of the most common approaches to reinforcement learning. To function properly, MAXQ requires a decomposition of the agent's task into a task hierarchy. Previously, this decomposition could only be done manually. In this paper, we propose a mechanism for automatic subtask discovery. The mechanism applies clustering to automatically construct the task hierarchy required by MAXQ, so that MAXQ can be fully automated. We present the design of our mechanism and demonstrate its effectiveness through theoretical analysis and an extensive experimental evaluation.
KW - Clustering
KW - MAXQ
KW - Reinforcement Learning
KW - System of Systems
UR - https://www.scopus.com/pages/publications/84876831101
U2 - 10.1109/ICTAI.2012.165
DO - 10.1109/ICTAI.2012.165
M3 - Conference contribution
AN - SCOPUS:84876831101
SN - 9780769549156
T3 - Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
SP - 1157
EP - 1162
BT - Proceedings - 2012 IEEE 24th International Conference on Tools with Artificial Intelligence, ICTAI 2012
T2 - 2012 IEEE 24th International Conference on Tools with Artificial Intelligence, ICTAI 2012
Y2 - 7 November 2012 through 9 November 2012
ER -