TY - GEN
T1 - iSpot
T2 - 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 16th IEEE International Conference on Ubiquitous Computing and Communications, ISPA/IUCC 2017
AU - Xu, Fei
AU - Jiang, Huan
AU - Zheng, Haoyue
AU - Shao, Wujie
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2018/5/25
Y1 - 2018/5/25
N2 - Achieving predictable performance for big data analytics running on cloud transient servers (e.g., EC2 spot instances) is challenging, because a transient server can be revoked by the cloud provider and the spot price is nontrivial to predict. Choosing low-price yet unstable cloud resources can severely degrade job performance. To tackle this issue, this paper proposes iSpot, a cost-efficient spot instance provisioning framework in the cloud, focusing on Spark as a representative DAG (Directed Acyclic Graph)-style big data analytics workload. Specifically, iSpot identifies the availability zones with stable spot instance resources by devising an accurate LSTM (Long Short-Term Memory)-based price prediction method. iSpot further predicts the performance of Spark stages and jobs by designing a fine-grained performance model that uses job profiling and the DAG information of stages. Based on the price prediction and the Spark performance model, iSpot provisions spot instances of the cost-efficient instance type (i.e., the instance type that achieves the minimum monetary cost), in order to deliver predictable performance for big data analytics. Extensive prototype experiments on Amazon EC2 demonstrate that iSpot can guarantee the performance of big data analytics while reducing the job budget with cloud transient servers.
KW - Big data analytics
KW - Cloud computing
KW - Cloud transient servers
KW - Predictable performance
KW - Spot instance provisioning
UR - https://www.scopus.com/pages/publications/85048369179
U2 - 10.1109/ISPA/IUCC.2017.00052
DO - 10.1109/ISPA/IUCC.2017.00052
M3 - Conference contribution
AN - SCOPUS:85048369179
T3 - Proceedings - 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 16th IEEE International Conference on Ubiquitous Computing and Communications, ISPA/IUCC 2017
SP - 314
EP - 321
BT - Proceedings - 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 16th IEEE International Conference on Ubiquitous Computing and Communications, ISPA/IUCC 2017
A2 - Martinez, Gregorio
A2 - Hill, Richard
A2 - Fox, Geoffrey
A2 - Mueller, Peter
A2 - Wang, Guojun
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 12 December 2017 through 15 December 2017
ER -