TY - GEN
T1 - Parallel K-means clustering of remote sensing images based on mapreduce
AU - Lv, Zhenhua
AU - Hu, Yingjie
AU - Zhong, Haidong
AU - Wu, Jianping
AU - Li, Bo
AU - Zhao, Hui
PY - 2010
Y1 - 2010
AB - K-Means clustering is a basic method for analyzing remote sensing (RS) images, as it gives a direct overview of the objects in a scene. Such work is usually done with desktop software (e.g. ENVI, ERDAS IMAGINE) on personal computers. On a PC, however, limited hardware resources and long processing times become a bottleneck when a large number of RS images must be processed. Parallel computing and distributed systems are well suited to this problem. Unlike traditional approaches, this paper parallelizes the algorithm on Hadoop, an open-source system that implements the MapReduce programming model. The paper first describes the color representation of RS images: pixels are converted into the CIELAB color space, which is better suited to distinguishing colors. It also gives an overview of traditional K-Means. The MapReduce programming model and the Hadoop platform are then briefly introduced. The model requires user-defined map and reduce functions, which let users parallelize processing in two stages. In addition, the paper details the map and reduce functions in pseudocode and reports performance results from experiments. The results are acceptable and may inspire other approaches to similar problems in remote sensing applications.
KW - Hadoop
KW - K-Means
KW - MapReduce
KW - Parallel
KW - Remote sensing
UR - https://www.scopus.com/pages/publications/78649523622
U2 - 10.1007/978-3-642-16515-3_21
DO - 10.1007/978-3-642-16515-3_21
M3 - Conference contribution
AN - SCOPUS:78649523622
SN - 3642165141
SN - 9783642165146
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 162
EP - 170
BT - Web Information Systems and Mining - International Conference, WISM 2010, Proceedings
T2 - 2010 International Conference on Web Information Systems and Mining, WISM 2010
Y2 - 23 October 2010 through 24 October 2010
ER -