TY - JOUR
T1 - CERT-DF: A Computing-Efficient and Robust Distributed Deep Forest Framework With Low Communication Overhead
AU - Xie, Li'an
AU - Wang, Ting
AU - Du, Shuyi
AU - Cai, Haibin
N1 - Publisher Copyright:
© 1990-2012 IEEE.
PY - 2023/12/1
Y1 - 2023/12/1
AB - As an alternative to deep learning models, deep forest outperforms deep neural networks in many aspects, with fewer hyperparameters and better robustness. To improve the computing performance of deep forest, ForestLayer proposes S-FTA, an efficient task-parallel algorithm at a fine sub-forest granularity, but the sub-forest granularity cannot be adaptively adjusted. BLB-gcForest further proposes an adaptive sub-forest splitting algorithm that dynamically adjusts the sub-forest granularity. However, with distributed storage, its BLB method needs to scan the whole dataset when sampling, which incurs considerable communication overhead. Moreover, BLB-gcForest's tree-based vector aggregation produces extensive redundant transfers and significantly degrades system performance in the vector aggregation stage. To address these issues and further improve the computing efficiency and scalability of distributed deep forest, in this paper we propose a novel Computing-Efficient and RobusT distributed Deep Forest framework, named CERT-DF. CERT-DF integrates three customized schemes, namely block-level pre-sampling, two-stage pre-aggregation, and system-level backup. Specifically, CERT-DF adopts block-level pre-sampling to perform local sampling on data blocks, eliminating frequent remote data access and maximizing parallel efficiency; applies two-stage pre-aggregation to adjust the class vector aggregation granularity, greatly decreasing communication overhead; and leverages system-level backup to enhance the system's disaster tolerance and substantially accelerate task recovery with minimal system resource overhead. Comprehensive experimental evaluations on multiple datasets show that CERT-DF significantly outperforms state-of-the-art approaches, achieving higher computing efficiency, lower system resource overhead, and better system robustness while maintaining good accuracy.
KW - Big Data bootstrap
KW - Deep forest
KW - distributed AI
KW - distributed computing
UR - https://www.scopus.com/pages/publications/85174844653
U2 - 10.1109/TPDS.2023.3324911
DO - 10.1109/TPDS.2023.3324911
M3 - Article
AN - SCOPUS:85174844653
SN - 1045-9219
VL - 34
SP - 3280
EP - 3293
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 12
ER -