Towards Efficient Workflow Scheduling Over Yarn Cluster Using Deep Reinforcement Learning

Jianguo Xue, Ting Wang, Puyu Cai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Hadoop Yarn is an open-source cluster manager responsible for resource management and job scheduling. However, data-driven applications are typically organized into workflows that consist of a series of jobs with dependencies. Yarn does not manage users' workflows and only considers the current job rather than the entire workflow when scheduling. In practice, multiple workflows share the same Yarn cluster and are pre-assigned separate Yarn resource queues to avoid mutual interference. However, this coarse-grained resource division can sometimes result in low resource utilization and increased pending time of jobs on the Yarn queue. For instance, one resource queue may have exhausted its quota while still having pending jobs, while other queues may have available resources but cannot begin executing any jobs due to unfulfilled data dependencies. To address this problem, we propose a deep reinforcement learning-based workflow scheduling scheme that takes into account job dependencies, job priorities, and dynamic resource usage. The proposed approach can intelligently identify and utilize free windows of different resource queues. Our simulation results demonstrate that the proposed DRL-based workflow scheduling scheme can significantly reduce the average job latency compared to existing approaches.

Original language: English
Title of host publication: GLOBECOM 2023 - 2023 IEEE Global Communications Conference
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 473-478
Number of pages: 6
ISBN (Electronic): 9798350310900
State: Published - 2023
Event: 2023 IEEE Global Communications Conference, GLOBECOM 2023 - Kuala Lumpur, Malaysia
Duration: 4 Dec 2023 → 8 Dec 2023

Publication series

Name: Proceedings - IEEE Global Communications Conference, GLOBECOM
ISSN (Print): 2334-0983
ISSN (Electronic): 2576-6813

Conference

Conference: 2023 IEEE Global Communications Conference, GLOBECOM 2023
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 4/12/23 → 8/12/23

Keywords

  • Deep Reinforcement Learning
  • Workflow Scheduling
  • Yarn Cluster
