TY - JOUR
T1 - Learning structured communication for multi-agent reinforcement learning
AU - Sheng, Junjie
AU - Wang, Xiangfeng
AU - Jin, Bo
AU - Yan, Junchi
AU - Li, Wenhao
AU - Chang, Tsung Hui
AU - Wang, Jun
AU - Zha, Hongyuan
N1 - Publisher Copyright: © 2022, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2022/10
Y1 - 2022/10
N2 - This work explores the large-scale multi-agent communication mechanism for multi-agent reinforcement learning (MARL). We summarize the general topology categories for communication structures, which are often manually specified in MARL literature. A novel framework termed Learning Structured Communication (LSC) is proposed by learning a flexible and efficient communication topology (hierarchical structure). It contains two modules: structured communication module and communication-based policy module. The structured communication module learns to form a hierarchical structure by maximizing the cumulative reward of the agents under the current communication-based policy. The communication-based policy module adopts hierarchical graph neural networks to generate messages, propagate information based on the learned communication structure, and select actions. In contrast to existing communication mechanisms, our method has a learnable and hierarchical communication structure. Experiments on large-scale battle scenarios show that the proposed LSC has high communication efficiency and global cooperation capability.
AB - This work explores the large-scale multi-agent communication mechanism for multi-agent reinforcement learning (MARL). We summarize the general topology categories for communication structures, which are often manually specified in MARL literature. A novel framework termed Learning Structured Communication (LSC) is proposed by learning a flexible and efficient communication topology (hierarchical structure). It contains two modules: structured communication module and communication-based policy module. The structured communication module learns to form a hierarchical structure by maximizing the cumulative reward of the agents under the current communication-based policy. The communication-based policy module adopts hierarchical graph neural networks to generate messages, propagate information based on the learned communication structure, and select actions. In contrast to existing communication mechanisms, our method has a learnable and hierarchical communication structure. Experiments on large-scale battle scenarios show that the proposed LSC has high communication efficiency and global cooperation capability.
KW - Graph Neural Networks
KW - Hierarchical Structure
KW - Learning Communication Structures
KW - Multi-agent Reinforcement Learning
UR - https://www.scopus.com/pages/publications/85137109783
U2 - 10.1007/s10458-022-09580-8
DO - 10.1007/s10458-022-09580-8
M3 - Article
AN - SCOPUS:85137109783
SN - 1387-2532
VL - 36
JO - Autonomous Agents and Multi-Agent Systems
JF - Autonomous Agents and Multi-Agent Systems
IS - 2
M1 - 50
ER -