Hybrid Checkpointing for Iterative Processing in BSP-Based Systems

Yi Yang, Chen Xu, Chao Kong, Aoying Zhou

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Distributed iterative processing arises in many application scenarios, including large-scale graph analytics and machine learning. Many systems employ the bulk synchronous parallel (BSP) model to synchronize iterations. In these BSP-based systems, the long running time of iterative processing in distributed environments makes fault tolerance crucial. Most BSP-based systems achieve fault tolerance by writing checkpoints with either a blocking or an unblocking strategy. However, the blocking strategy incurs checkpointing overhead even in failure-free runs, whereas the unblocking strategy incurs a recovery cost if a failure occurs before the system has finished writing the checkpoint. Motivated by this trade-off, we aim to choose a checkpointing strategy each time a checkpoint is required during iterative processing, so as to reduce the overall execution time. In particular, we formulate the checkpointing choice problem: how to choose the strategy that minimizes execution time. The challenge is to make this choice at runtime, without information about the future. To address this problem, we propose hybrid checkpointing, which heuristically chooses either blocking or unblocking checkpointing based on cost evaluation. Our experiments on Giraph, a typical BSP-based system, show that hybrid checkpointing outperforms both blocking and unblocking checkpointing.
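
The trade-off described in the abstract can be illustrated with a toy expected-cost comparison. This is only a sketch under stated assumptions, not the authors' heuristic: the abstract does not give the cost model, so the class `CostEstimate`, its fields, and the function `choose_strategy` are all hypothetical names introduced here. The sketch assumes that blocking checkpointing always pays a fixed stall, while unblocking checkpointing pays an extra recovery cost only if a failure occurs before the checkpoint completes.

```python
from dataclasses import dataclass


@dataclass
class CostEstimate:
    """Hypothetical inputs; none of these names come from the paper."""
    # Seconds the iteration stalls while a blocking checkpoint is written.
    blocking_overhead_s: float
    # Extra recovery seconds if a failure hits before an unblocking
    # checkpoint has been fully written.
    extra_recovery_s: float
    # Assumed probability of a failure during the unblocking write window.
    failure_probability: float


def choose_strategy(est: CostEstimate) -> str:
    """Return "blocking" or "unblocking", whichever has the lower expected cost.

    Assumption: blocking cost is paid on every checkpoint, unblocking cost
    is paid only when a failure interrupts the checkpoint. Ties go to
    "unblocking".
    """
    if not 0.0 <= est.failure_probability <= 1.0:
        raise ValueError("failure_probability must be in [0, 1]")
    expected_blocking = est.blocking_overhead_s
    expected_unblocking = est.failure_probability * est.extra_recovery_s
    return "blocking" if expected_blocking < expected_unblocking else "unblocking"
```

Under these assumptions, a cheap stall combined with an expensive recovery favors blocking, and a low failure probability favors unblocking. A real system would have to estimate these quantities at runtime, which is the difficulty the abstract points to.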

Original language: English
Title of host publication: Web Information Systems and Applications - 18th International Conference, WISA 2021, Proceedings
Editors: Chunxiao Xing, Xiaoming Fu, Yong Zhang, Guigang Zhang, Chaolemen Borjigin
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 693-705
Number of pages: 13
ISBN (Print): 9783030875701
State: Published - 2021
Event: 18th International Conference on Web Information Systems and Applications, WISA 2021 - Kaifeng, China
Duration: 24 Sep 2021 to 26 Sep 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12999 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Conference on Web Information Systems and Applications, WISA 2021
Country/Territory: China
City: Kaifeng
Period: 24/09/21 to 26/09/21

Keywords

  • BSP
  • Checkpointing
  • Iterative processing
