Weighted Mean-Field Multi-Agent Reinforcement Learning via Reward Attribution Decomposition

Tingyu Wu, Wenhao Li, Bo Jin, Wei Zhang, Xiangfeng Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Existing multi-agent reinforcement learning (MARL) algorithms are inefficient in many-agent scenarios because the complexity of dynamic interactions grows exponentially with the number of agents. Mean-field theory has been introduced to improve scalability by approximating these interactions as those between a single agent and the mean effect of its neighbors. However, considering only the averaged actions of the neighborhood at the previous step, and ignoring the dynamic influence of neighbors, leads to unstable training and sub-optimal solutions. This paper proposes the Weighted Mean-Field Multi-Agent Reinforcement Learning via Reward Attribution Decomposition (MFRAD) framework, which distinguishes heterogeneous and hysteretic neighbor effects through a weighted mean-field approximation and reward attribution decomposition. Multi-head attention is employed to compute the weights that define the weighted mean-field Q-function. To further eliminate the impact of hysteretic information, reward attribution decomposition is used to decompose the weighted mean-field Q-value, which improves the interpretability of MFRAD and enables fully decentralized execution without information exchange. Two novel regularization terms are also introduced to guarantee the consistency of temporal relationships among agents and the unambiguity of the local Q-value when no other agents are present. Numerical experiments on many-agent scenarios demonstrate superior performance over existing baselines.
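The contrast between a plain and a weighted mean-field action can be illustrated with a small sketch. This is not the MFRAD implementation: it assumes a single attention head (the abstract specifies multi-head attention), hand-picked embedding vectors in place of learned ones, and discrete actions encoded as indices. All names (`attention_weights`, `weighted_mean_action`, the toy data) are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one agent over its neighbors.

    Assumption: one head, with query/key vectors given directly instead of
    being produced by learned projections.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    return softmax(scores)


def weighted_mean_action(weights, neighbor_actions, n_actions):
    """Weighted average of one-hot neighbor actions.

    With uniform weights this reduces to the plain mean action of the
    standard mean-field approximation.
    """
    mean = [0.0] * n_actions
    for w, a in zip(weights, neighbor_actions):
        mean[a] += w
    return mean


# Hypothetical toy data: one agent with three neighbors and three actions.
query = [1.0, 0.0]                           # embedding of the central agent
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # embeddings of the neighbors
neighbor_actions = [0, 2, 0]                 # action index of each neighbor

weights = attention_weights(query, keys)
mean_action = weighted_mean_action(weights, neighbor_actions, n_actions=3)
uniform_mean = weighted_mean_action([1 / 3] * 3, neighbor_actions, n_actions=3)
```

In this toy setting the neighbors whose embeddings align with the agent's query receive larger weights, so their shared action contributes more to `mean_action` than it does to `uniform_mean`. A weighted mean-field Q-function would then take such a weighted mean action as input in place of the uniform average.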

Original language: English
Title of host publication: Database Systems for Advanced Applications. DASFAA 2022 International Workshops - BDMS, BDQM, GDMA, IWBT, MAQTDS, and PMBD, Proceedings
Editors: Uday Kiran Rage, Vikram Goyal, P. Krishna Reddy
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 301-316
Number of pages: 16
ISBN (Print): 9783031112164
State: Published - 2022
Event: International Workshops on BDMS, BDQM, GDMA, IWBT, MAQTDS, and PMBD 2022, held in conjunction with the 27th International Conference on Database Systems for Advanced Applications, DASFAA 2022 - Virtual, Online
Duration: 11 Apr 2022 to 14 Apr 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13248 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Workshops on BDMS, BDQM, GDMA, IWBT, MAQTDS, and PMBD 2022, held in conjunction with the 27th International Conference on Database Systems for Advanced Applications, DASFAA 2022
City: Virtual, Online
Period: 11/04/22 to 14/04/22

Keywords

  • Multi-agent reinforcement learning
  • Reward attribution decomposition
  • Weighted mean-field approximation
