Distance-aware virtual cluster performance optimization: A hadoop case study

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

Cloud computing and big data are two important and growing trends in information technology. However, data-intensive computing struggles to perform well on virtual machines in the cloud because of competition for virtualized resources and complex network communication. The network is one of the most notorious bottlenecks, which makes strategies for lowering communication and transmission cost in virtual clusters especially important. In this paper, we present a novel cluster performance optimization strategy named vClusterOpt. vClusterOpt finds centralized subgraphs of the node graph and chooses the node with the shortest logical distance as the kernel node of each subgraph, reducing inter-machine communication and transmission cost in the virtual cluster. To calculate logical distance accurately, we define two kinds of logical distance: Logical Communication Distance (LCD) and Logical Transmission Distance (LTD). The VM with the shortest LCD to the others is used as the communication kernel node, which bears the most communication stress, while the VM with the shortest LTD is treated as the transmission kernel node, which bears the most data transmission stress. We use benchmarks running on Hadoop as representative data-intensive computing workloads to demonstrate the effectiveness of our approach. Experiments show that our distance-aware virtual cluster optimization strategy achieves an average performance improvement of 20%.
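
The kernel-node selection described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas for LCD or LTD, so the code below assumes that pairwise logical distances between VMs are already available as a nested dictionary, and it interprets "shortest distance with others" as the smallest summed distance to all other VMs. The function name `pick_kernel_node` and the sample values are hypothetical and are not taken from the paper.

```python
from typing import Dict

# Pairwise logical distances: distances[a][b] is the distance from VM a to VM b.
# The values used here are made up for illustration only.
Distances = Dict[str, Dict[str, float]]


def pick_kernel_node(distances: Distances) -> str:
    """Return the VM with the smallest summed distance to all other VMs.

    Assumption: "shortest logical distance" is read as the minimum total
    distance to the other VMs; the paper may define it differently.
    Ties are broken by VM name so the result is deterministic.
    """
    def total(vm: str) -> float:
        return sum(d for other, d in distances[vm].items() if other != vm)

    return min(sorted(distances), key=total)


# Hypothetical LCD and LTD matrices for a three-VM subgraph.
lcd: Distances = {
    "vm1": {"vm2": 1.0, "vm3": 4.0},
    "vm2": {"vm1": 1.0, "vm3": 2.0},
    "vm3": {"vm1": 4.0, "vm2": 2.0},
}
ltd: Distances = {
    "vm1": {"vm2": 3.0, "vm3": 1.0},
    "vm2": {"vm1": 3.0, "vm3": 5.0},
    "vm3": {"vm1": 1.0, "vm2": 5.0},
}

communication_kernel = pick_kernel_node(lcd)  # chosen by LCD
transmission_kernel = pick_kernel_node(ltd)   # chosen by LTD
print(communication_kernel, transmission_kernel)
```

With these sample values, the two metrics select different VMs, which matches the abstract's distinction between a communication kernel node and a transmission kernel node.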

Original language: English
Title of host publication: 2013 IEEE International Conference on Cluster Computing, CLUSTER 2013
State: Published - 2013
Externally published: Yes
Event: 15th IEEE International Conference on Cluster Computing, CLUSTER 2013 - Indianapolis, IN, United States
Duration: 23 Sep 2013 to 27 Sep 2013

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
ISSN (Print): 1552-5244

Conference

Conference: 15th IEEE International Conference on Cluster Computing, CLUSTER 2013
Country/Territory: United States
City: Indianapolis, IN
Period: 23/09/13 to 27/09/13

Keywords

  • Hadoop
  • big data
  • cloud computing
  • distance-aware virtual cluster
  • virtual machine communication
