Dynamic Conservative Degree Allocation for Offline Multi-Agent Reinforcement Learning

  • Haosheng Chen
  • Yun Hua*
  • Junjie Sheng
  • Wenhao Li
  • Bo Jin
  • Xiangfeng Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Offline Multi-agent Reinforcement Learning (MARL) has been designed to learn policies from pre-collected datasets without real-time interaction in multi-agent systems. A primary concern in offline MARL is the conservative degree allocation, which involves assigning different conservatism levels to agents based on their varying influence on the system. Current approaches frequently neglect this crucial aspect, resulting in suboptimal performance, particularly when agents have differing impacts on the environment. In this paper, we propose OMCDA, a novel offline MARL algorithm that addresses the issue of conservative degree allocation by assigning dynamic conservatism levels to each agent based on their individual influence on system performance. OMCDA decomposes the Q-function into two components: one for computing the return and another for capturing deviations from the behavior policy. Additionally, OMCDA employs a dynamic allocation mechanism that adjusts conservatism levels for agents based on varying impacts, while maintaining coherent credit assignment and ensuring robust system performance throughout learning. We evaluate OMCDA on MuJoCo and SMAC, showing it outperforms existing offline MARL methods in challenging tasks by effectively addressing conservative degree allocation.
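
The abstract gives no formulas, so the following is only an illustrative sketch of the two ideas it names, not OMCDA itself. It assumes a per-agent value of the form "return term minus a weighted deviation penalty" and assumes that a fixed conservatism budget is split across agents by a softmax over influence scores. The function names, the softmax rule, the budget, the sample numbers, and the choice to make more influential agents more conservative are all hypothetical.

```python
import math


def allocate_conservatism(influences, total_budget=1.0, temperature=1.0):
    """Hypothetical rule: split a fixed conservatism budget across agents
    with a softmax over per-agent influence scores."""
    m = max(influences)  # subtract the max for numerical stability
    exps = [math.exp((x - m) / temperature) for x in influences]
    z = sum(exps)
    return [total_budget * e / z for e in exps]


def conservative_values(return_terms, deviation_terms, weights):
    """Hypothetical combination: return component minus the weighted
    behavior-policy deviation component, computed per agent."""
    return [r - w * d for r, d, w in zip(return_terms, deviation_terms, weights)]


# Made-up numbers: agent 0 is assumed to influence the system the most.
influences = [2.0, 0.5, 0.5]
weights = allocate_conservatism(influences)
values = conservative_values([1.0, 1.0, 1.0], [0.3, 0.3, 0.3], weights)
```

Under these assumptions, the agent with the largest influence score receives the largest share of the budget and therefore the largest penalty for the same deviation. The paper may define influence, direction, and the allocation rule differently.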

Original language: English
Title of host publication: Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2025
Editors: Yevgeniy Vorobeychik, Sanmay Das, Ann Nowe
Publisher: International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Pages: 2457-2459
Number of pages: 3
ISBN (Electronic): 9798400714269
State: Published - 2025
Event: 24th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2025 - Detroit, United States
Duration: 19 May 2025 → 23 May 2025

Publication series

Name: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
ISSN (Print): 1548-8403
ISSN (Electronic): 1558-2914

Conference

Conference: 24th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2025
Country/Territory: United States
City: Detroit
Period: 19/05/25 → 23/05/25

Keywords

  • Multi-agent reinforcement learning
  • Offline reinforcement learning
