Clustering with Entropy-based Recombination for Training GCNs on Large Graphs

  • Shangwei Wu*
  • Yingtong Xiong
  • Hui Liang
  • Chuliang Weng

  *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

With the development of deep learning methods on non-grid graph data, Graph Convolutional Networks (GCNs) are playing important roles in a wide range of scenarios. When dealing with graphs of ever-growing size, even if the compute capability of one GPU card is sufficient, its limited memory capacity can make training on large graphs infeasible. Sampling methods on three levels (i.e., node-level, layer-level, and subgraph-level) have been proposed to improve the scalability of GCNs. However, sampling-based approaches still have drawbacks, including time-consuming sampling processes and biased node representations. To tackle these issues, we propose a novel subgraph-based sampling method that considers the generalized distance between the label distribution of each subgraph and that of the whole graph. Specifically, our method introduces two pre-steps before training: (1) partitioning all the nodes in the original graph into different clusters through an efficient clustering algorithm; (2) combining the clusters obtained in the first step into a set of bigger groups (subgraphs) based on information entropy. Experiments show that our method preserves a label distribution similar to that of the whole graph and outperforms state-of-the-art (SOTA) models in terms of classification accuracy on different datasets. Besides, the time cost of our pre-processing procedure is acceptable compared with the time spent in training.
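The second pre-step in the abstract (recombining clusters so that each group's label distribution stays close to that of the whole graph) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the distance measure or the combination strategy, so the use of KL divergence, the greedy selection, and the names `label_hist`, `kl_to_global`, and `recombine` are all assumptions made for illustration. Cluster assignments are taken as given, as if produced by any graph clustering algorithm in step (1).

```python
import numpy as np

def label_hist(labels, num_classes):
    # Count how many nodes carry each class label.
    return np.bincount(labels, minlength=num_classes).astype(float)

def kl_to_global(group_hist, global_hist):
    # KL(group || global) between normalized label histograms.
    # ASSUMPTION: KL divergence stands in for the paper's "generalized distance".
    # A group is a subset of the graph, so q > 0 wherever p > 0.
    p = group_hist / group_hist.sum()
    q = global_hist / global_hist.sum()
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def recombine(cluster_ids, labels, num_classes, num_groups):
    # Hypothetical greedy recombination: build groups one at a time, each
    # seeded with the largest remaining cluster, then repeatedly add the
    # cluster that brings the group's label distribution closest to the
    # global one. Produces up to num_groups groups.
    global_hist = label_hist(labels, num_classes)
    hists = {int(c): label_hist(labels[cluster_ids == c], num_classes)
             for c in np.unique(cluster_ids)}
    remaining = sorted(hists, key=lambda c: -hists[c].sum())  # largest first
    per_group = -(-len(remaining) // num_groups)  # ceiling division
    groups = []
    while remaining:
        seed = remaining.pop(0)
        members, hist = [seed], hists[seed].copy()
        while remaining and len(members) < per_group:
            best = min(remaining,
                       key=lambda c: kl_to_global(hist + hists[c], global_hist))
            remaining.remove(best)
            members.append(best)
            hist += hists[best]
        groups.append((members, kl_to_global(hist, global_hist)))
    return groups

# Toy usage: four single-label clusters, two classes, two groups.
cluster_ids = np.repeat([0, 1, 2, 3], 5)
labels = np.array([0] * 5 + [1] * 5 + [0] * 5 + [1] * 5)
groups = recombine(cluster_ids, labels, num_classes=2, num_groups=2)
```

In the toy usage, each returned entry pairs a list of cluster ids with that group's divergence from the global label distribution; a lower value means the group is a less biased sample of the labels. A real pipeline would also need to balance group sizes by node count, which this sketch does not attempt.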

Original language: English
Title of host publication: Proceedings - 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023
Editors: Jihe Wang, Yi He, Thang N. Dinh, Christan Grant, Meikang Qiu, Witold Pedrycz
Publisher: IEEE Computer Society
Pages: 1170-1177
Number of pages: 8
ISBN (Electronic): 9798350381641
Publication status: Published - 2023
Event: 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023 - Shanghai, China
Duration: 1 Dec 2023 to 4 Dec 2023

Publication series

Name: IEEE International Conference on Data Mining Workshops, ICDMW
ISSN (Print): 2375-9232
ISSN (Electronic): 2375-9259

Conference

Conference: 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023
Country/Territory: China
City: Shanghai
Period: 1/12/23 to 4/12/23
