TY - GEN
T1 - An adaptive auto-configuration tool for Hadoop
AU - Li, Changlong
AU - Zhuang, Hang
AU - Lu, Kun
AU - Sun, Mingming
AU - Zhou, Jinhong
AU - Dai, Dong
AU - Zhou, Xuehai
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/10/13
Y1 - 2014/10/13
AB - With the emergence of 'big data', the ability to handle large datasets has become critical to the success of industrial organizations such as Google, Amazon, Yahoo! and Facebook. As an important cloud computing framework for bulk data processing, Hadoop is widely used in these organizations. However, the performance of MapReduce is seriously limited by its rigid configuration strategy. Even for a single simple Hadoop job, users have to set a large number of tuning parameters, and misconfiguration can easily lead to performance loss. In this paper, we present an adaptive automatic configuration tool (AACT) for Hadoop to achieve performance optimization. To this end, we propose a mathematical model that accurately learns the relationship between system performance and configuration parameters, and then configure the Hadoop system based on this model. With the help of AACT, Hadoop is able to adapt to the hardware and software configurations dynamically and drive the system to an optimal configuration in acceptable time. Experimental results show its efficiency and adaptability, and that it is ten times faster than the default configuration.
KW - Auto-Configuration
KW - Hadoop
KW - Self-Learning
UR - https://www.scopus.com/pages/publications/84908413586
U2 - 10.1109/ICECCS.2014.17
DO - 10.1109/ICECCS.2014.17
M3 - Conference contribution
AN - SCOPUS:84908413586
T3 - Proceedings of the IEEE International Conference on Engineering of Complex Computer Systems, ICECCS
SP - 69
EP - 72
BT - 2014 19th International Conference on Engineering of Complex Computer Systems, ICECCS 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th International Conference on Engineering of Complex Computer Systems, ICECCS 2014
Y2 - 4 August 2014 through 7 August 2014
ER -