Distributed fuzzy rough set for big data analysis in cloud computing

Wenhao Qu, Linghe Kong, Kaishun Wu, Feilong Tang, Guihai Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

Fuzzy rough set based feature selection is a widely adopted technique for big data analysis. However, its high accuracy depends on all the data correlations, so it always runs in a centralized computing mode. As data volume grows, a centralized server, limited in particular by its computation capability and memory space, can no longer afford the fuzzy rough set computation. To enable fuzzy rough sets for big data analysis, this paper proposes a novel Distributed Fuzzy Rough Set (DFRS) based feature selection method for cloud computing, which splits the task and assigns the parts to multiple nodes for parallel computing. The key challenge is to maintain the global information on each distributed node without storing the entire fuzzy relation matrix. We tackle this challenge with a dynamic data decomposition algorithm and a data summarization process on each distributed node. Extensive experiments on multiple real datasets demonstrate that DFRS significantly reduces runtime while its feature selection accuracy remains nearly the same as that of traditional centralized computing.
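To make the memory bottleneck mentioned in the abstract concrete, the following is a minimal sketch of the centralized fuzzy rough dependency degree that feature selection of this kind typically maximizes. It is not the DFRS algorithm from the paper. It assumes one common textbook formulation (a range-normalized similarity per feature, the minimum t-norm to combine features, and crisp class labels), and the names `similarity`, `dependency`, `rows`, and `labels` are hypothetical, chosen only for illustration.

```python
def similarity(values):
    """Fuzzy similarity matrix for one numeric feature: 1 - |a(x) - a(y)| / range."""
    span = max(values) - min(values)
    if span == 0:
        # A constant feature cannot tell any two samples apart.
        return [[1.0] * len(values) for _ in values]
    return [[1.0 - abs(u - v) / span for v in values] for u in values]


def dependency(rows, labels, subset):
    """Dependency degree of the class labels on the feature indices in `subset`."""
    n = len(rows)
    # Full n x n fuzzy relation matrix: the structure whose size becomes
    # prohibitive on a single server as n grows.
    rel = [[1.0] * n for _ in range(n)]
    for f in subset:
        sim = similarity([r[f] for r in rows])
        for i in range(n):
            for j in range(n):
                rel[i][j] = min(rel[i][j], sim[i][j])  # minimum t-norm
    # Positive-region membership of sample i: how dissimilar it is from the
    # most similar sample that carries a different class label.
    pos = []
    for i in range(n):
        others = [1.0 - rel[i][j] for j in range(n) if labels[j] != labels[i]]
        pos.append(min(others) if others else 1.0)
    return sum(pos) / n


# Illustrative toy data (an assumption, not from the paper):
# feature 0 separates the two classes, feature 1 is constant.
rows = [[0.0, 3.0], [0.1, 3.0], [0.9, 3.0], [1.0, 3.0]]
labels = [0, 0, 1, 1]
print(dependency(rows, labels, [0]))
print(dependency(rows, labels, [1]))
```

In this sketch the relation matrix needs memory quadratic in the number of samples, and every entry depends on a pair of samples, which is why splitting the data across nodes without losing global information is the hard part the abstract refers to.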

Original language: English
Title of host publication: Proceedings - 2019 IEEE 25th International Conference on Parallel and Distributed Systems, ICPADS 2019
Publisher: IEEE Computer Society
Pages: 109-116
Number of pages: 8
ISBN (Electronic): 9781728125831
State: Published - Dec 2019
Externally published: Yes
Event: 25th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2019 - Tianjin, China
Duration: 4 Dec 2019 to 6 Dec 2019

Publication series

Name: Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
Volume: 2019-December
ISSN (Print): 1521-9097

Conference

Conference: 25th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2019
Country/Territory: China
City: Tianjin
Period: 4/12/19 to 6/12/19

Keywords

  • Big data
  • Distributed feature selection
  • Dynamic data decomposition
  • Fuzzy rough sets
