Learning Optimal “Pigovian Tax” in Sequential Social Dilemmas

  • Yun Hua
  • Bo Jin
  • Shang Gao
  • Xiangfeng Wang*
  • Wenhao Li
  • Hongyuan Zha
  • *Corresponding author for this work
  • East China Normal University
  • Tongji University
  • The Chinese University of Hong Kong, Shenzhen

Research output: Contribution to journal, conference article, peer-reviewed

Abstract

In multi-agent reinforcement learning (MARL), each agent acts to maximize its individual accumulated reward. However, individual accumulated rewards cannot fully reflect how other agents perceive them, which results in selfish behaviors that undermine global performance and gives rise to social dilemmas. This paper adapts the well-known theory of externalities from economics to analyze social dilemmas in MARL, and proposes a method called Learning Optimal Pigovian Tax (LOPT) to internalize the externalities in MARL. Furthermore, a reward shaping mechanism based on the approximated optimal “Pigovian Tax” is applied to reduce the social cost of each agent and alleviate the social dilemmas. Compared with existing state-of-the-art methods, the proposed LOPT leads to higher collective social welfare in both the Escape Room and the Cleanup environments, which shows the superiority of our method in solving social dilemmas.

Original language: English
Pages (from-to): 2784-2786
Number of pages: 3
Journal: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Volume: 2023-May
Publication status: Published - 2023
Event: 22nd International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2023 - London, United Kingdom
Duration: 29 May 2023 to 2 Jun 2023
