
Hierarchical Multiagent Reinforcement Learning for Allocating Guaranteed Display Ads

  • Lu Wang
  • Lei Han
  • Xinru Chen
  • Chengchang Li
  • Junzhou Huang
  • Weinan Zhang
  • Wei Zhang*
  • Xiaofeng He*
  • Dijun Luo
  • *Corresponding author for this work
  • East China Normal University
  • Tencent
  • Shanghai Jiao Tong University

Research output: Contribution to journal, Article, peer-reviewed

Abstract

In this article, we study the problem of guaranteed display ads (GDAs) allocation, which requires proactively allocating display ads to different impressions to fulfill the impression demands specified in their contracts. Existing methods for this problem either assume that the impressions are static or consider only a specific ad's benefits. Thus, they are hard to generalize to the industrial production scenario, where the impressions are dynamic and large-scale and the overall allocation optimality across all considered GDAs is required. To bridge this gap, we formulate this problem as a sequential decision-making problem in the scope of multiagent reinforcement learning (MARL), by assigning an allocation agent to each ad and coordinating all the agents for allocating GDAs. The inputs are the states of each ad (e.g., the demands of the ad and the remaining time steps for displaying it) and the impressions at different time steps, and the outputs are the display ratios of each ad for each impression. Specifically, we propose a novel hierarchical MARL (HMARL) method that creates hierarchies over the agent policies to handle a large number of ads and the dynamics of impressions. HMARL contains: 1) a manager policy that guides the agent to choose an appropriate subpolicy and 2) a set of subpolicies that let the agents behave diversely conditioned on their states. Extensive experiments on three real-world data sets from the Tencent advertising platform with tens of millions of records demonstrate significant improvements of HMARL over state-of-the-art approaches.
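
To make the two-level structure described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes one agent per ad, a manager that picks one of two subpolicies from the ad's state, and subpolicies that emit a score which is then normalized into display ratios for a single impression. All names (AdState, manager_policy, urgent_subpolicy, relaxed_subpolicy, allocate) and the threshold value are hypothetical, and hand-written rules stand in for the learned policies.

```python
from dataclasses import dataclass


@dataclass
class AdState:
    """Hypothetical per-ad state, following the examples given in the abstract."""
    remaining_demand: float  # impressions still owed under the contract
    remaining_steps: int     # time steps left to display the ad


def required_pace(state: AdState) -> float:
    """Impressions needed per remaining time step (guards against zero steps)."""
    return state.remaining_demand / max(state.remaining_steps, 1)


# Hypothetical subpolicies: each maps an ad's state to a non-negative score.
def urgent_subpolicy(state: AdState) -> float:
    return required_pace(state)


def relaxed_subpolicy(state: AdState) -> float:
    return 0.5 * required_pace(state)


SUBPOLICIES = [urgent_subpolicy, relaxed_subpolicy]


def manager_policy(state: AdState, urgency_threshold: float = 100.0) -> int:
    """Toy manager: returns the index of the subpolicy to use.

    A learned manager policy would replace this fixed rule (assumption).
    """
    return 0 if required_pace(state) >= urgency_threshold else 1


def allocate(states: list[AdState]) -> list[float]:
    """Return display ratios (summing to 1) for one impression across all ads."""
    scores = [SUBPOLICIES[manager_policy(s)](s) for s in states]
    total = sum(scores)
    if total == 0:
        # No ad has outstanding demand: fall back to a uniform split.
        return [1.0 / len(states)] * len(states)
    return [score / total for score in scores]


states = [AdState(remaining_demand=1000, remaining_steps=5),
          AdState(remaining_demand=100, remaining_steps=10)]
ratios = allocate(states)
```

In this toy setup the first ad is behind schedule, so the manager routes it to the more aggressive subpolicy and it receives most of the display ratio. The sketch omits training, agent coordination, and impression features, all of which the abstract attributes to the full method.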

Original language: English
Pages (from-to): 5361-5373
Number of pages: 13
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 33
Issue number: 10
DOI
Publication status: Published - 1 Oct 2022
