Cost-effective cloud server provisioning for predictable performance of big data analytics

Fei Xu, Haoyue Zheng, Huan Jiang, Wujie Shao, Haikun Liu, Zhi Zhou

Research output: Contribution to journal › Article › peer-review

42 Scopus citations

Abstract

Cloud datacenters are underutilized due to server over-provisioning. To increase datacenter utilization, cloud providers offer users the option to run workloads such as big data analytics on the underutilized resources, in the form of cheap yet revocable transient servers (e.g., EC2 spot instances, GCE preemptible instances). Although transient servers are offered at highly reduced prices, deploying big data analytics on them can severely degrade job performance because of instance revocations. To tackle this issue, this paper proposes iSpot, a cost-effective transient server provisioning framework for achieving predictable performance in the cloud, focusing on Spark as a representative Directed Acyclic Graph (DAG)-style big data analytics workload. iSpot first identifies the cloud transient servers that remain stable during job execution by devising an accurate Long Short-Term Memory (LSTM)-based price prediction method. Leveraging automatic job profiling and the acquired DAG information of stages, we further build an analytical performance model and present a lightweight critical data checkpointing mechanism for Spark, which together enable the iSpot provisioning strategy to guarantee job performance on stable transient servers. Extensive prototype experiments on both EC2 spot instances and GCE preemptible instances demonstrate that iSpot is able to guarantee the performance of big data analytics running on cloud transient servers while reducing the job budget by up to 83.8 percent in comparison to state-of-the-art server provisioning strategies, with acceptable runtime overhead.

Original language: English
Article number: 8478347
Pages (from-to): 1036-1051
Number of pages: 16
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 30
Issue number: 5
State: Published - 1 May 2019

Keywords

  • Predictable performance
  • big data analytics
  • cloud computing
  • data checkpointing
  • transient server provisioning
