Demonstration on Unblocking Checkpoint for Fault-Tolerance in Pregel-Like Systems

  • Zhenhua Yang
  • Yi Yang
  • Chen Xu*
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citation

Abstract

Pregel-like systems are developed to execute iterative applications on massive graph data, which often leads to long runtimes. These systems are usually deployed on clusters of commodity servers, where failures are common. Hence, fault tolerance is crucial for them. A typical fault-tolerance technique is checkpointing, which can be performed in a blocking or an unblocking manner. Blocking checkpointing incurs notable overhead because it pauses the iterative computation. Unblocking checkpointing decreases this overhead by running checkpointing in parallel with the iterative computation. However, it introduces resource contention among parallel checkpointing tasks, which may prolong the overall execution time. Two techniques can effectively improve unblocking checkpointing: the queuing strategy, which alleviates the resource contention, and the staleness/tardiness-aware skipping policy, which selects an optimal checkpoint from the queued checkpoints. In this demonstration, we showcase their internal mechanisms based on Apache Giraph.
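
The interplay between queuing and skipping described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation and not Giraph code: the function `simulate`, its parameters, the single-writer model, and the "write the newest queued checkpoint, skip the older ones" rule are all hypothetical simplifications. The abstract does not specify how the staleness/tardiness-aware policy actually chooses a checkpoint.

```python
from collections import deque


def simulate(num_supersteps, superstep_time, checkpoint_time):
    """Return the supersteps whose checkpoints were actually written.

    Assumptions (hypothetical, for illustration only):
    - every superstep requests a checkpoint and never waits for it (unblocking);
    - a single writer handles one checkpoint at a time (queuing), so
      checkpoint tasks do not compete with each other for resources;
    - the writer is only examined at superstep boundaries;
    - when the writer is idle it writes the newest queued checkpoint and
      drops the older ones (a stand-in for the skipping policy).
    """
    queue = deque()        # pending checkpoint requests, oldest first
    written = []           # supersteps whose checkpoints reached storage
    writer_free_at = 0.0   # simulated time at which the writer becomes idle

    for step in range(1, num_supersteps + 1):
        now = step * superstep_time   # time at which this superstep ends
        queue.append(step)            # enqueue the request, keep computing

        if writer_free_at <= now:
            chosen = queue.pop()      # newest request is the least stale
            queue.clear()             # skip every older queued request
            written.append(chosen)
            writer_free_at = now + checkpoint_time

    return written


if __name__ == "__main__":
    # Checkpointing slower than a superstep: some checkpoints are skipped.
    print(simulate(6, 1.0, 2.5))
    # Checkpointing faster than a superstep: nothing is skipped.
    print(simulate(4, 1.0, 0.5))
```

In this toy model, a checkpoint that takes 2.5 time units against 1.0-unit supersteps results in only supersteps 1 and 4 being persisted, while a 0.5-unit checkpoint persists every superstep. A real policy would also have to weigh how late (tardy) a queued checkpoint is against how much recomputation it would save after a failure.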

Original language: English
Title of host publication: Web and Big Data - 6th International Joint Conference, APWeb-WAIM 2022, Proceedings
Editors: Bohan Li, Chuanqi Tao, Lin Yue, Xuming Han, Diego Calvanese, Toshiyuki Amagasa
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 456-460
Number of pages: 5
ISBN (Print): 9783031252006
DOIs
State: Published - 2023
Event: 6th International Joint Conference on Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM), APWeb-WAIM 2022 - Nanjing, China
Duration: 25 Nov 2022 to 27 Nov 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13423 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th International Joint Conference on Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM), APWeb-WAIM 2022
Country/Territory: China
City: Nanjing
Period: 25/11/22 to 27/11/22

Keywords

  • Checkpoint
  • Fault tolerance
  • Graph processing
