Clustering with Entropy-based Recombination for Training GCNs on Large Graphs

Shangwei Wu, Yingtong Xiong, Hui Liang, Chuliang Weng

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

With the development of deep learning methods on non-grid graph data, Graph Convolutional Networks (GCNs) play an important role in a wide range of scenarios. When dealing with graphs of ever-growing size, even if the compute capability of one GPU card is sufficient, its limited memory capacity can make training on large graphs infeasible. Sampling methods on three levels (i.e., node-level, layer-level, and subgraph-level) have been proposed to improve the scalability of GCNs. However, sampling-based approaches still have drawbacks, including time-consuming sampling processes and biased node representations. To tackle these issues, we propose a novel subgraph-based sampling method that considers the generalized distance between the label distribution of each subgraph and that of the whole graph. Specifically, our method introduces two pre-steps before training: (1) partitioning all the nodes in the original graph into different clusters through an efficient clustering algorithm; (2) combining the clusters obtained in the first step into a set of bigger groups (subgraphs) based on information entropy. Experiments show that our method preserves a label distribution similar to that of the whole graph and outperforms state-of-the-art (SOTA) models in classification accuracy on different datasets. In addition, the time cost of our pre-processing procedure is acceptable compared with the time spent in training.
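The abstract does not specify the clustering algorithm or the exact entropy criterion, so the following is only a minimal sketch of the second pre-step under stated assumptions: KL divergence stands in for the "generalized distance" between label distributions, clusters are assigned greedily, and every group holds roughly the same number of clusters. The function names (`label_distribution`, `recombine`, and so on) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def label_distribution(labels, num_classes):
    """Empirical class distribution of a non-empty list of integer labels."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in range(num_classes)]


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); eps guards against zero entries in q."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def merged_cost(group_labels, cluster_labels, global_dist, num_classes):
    """Distance to the whole-graph distribution if the cluster joined the group."""
    merged = group_labels + cluster_labels
    return kl_divergence(label_distribution(merged, num_classes), global_dist)


def recombine(clusters, num_classes, num_groups):
    """Greedily combine clusters (lists of node labels) into groups.

    Assumed heuristic, not the authors' algorithm: each cluster joins the
    non-full group whose pooled label distribution ends up closest (in KL)
    to the whole-graph label distribution.
    Returns a list of groups, each a list of cluster indices.
    """
    all_labels = [y for cluster in clusters for y in cluster]
    global_dist = label_distribution(all_labels, num_classes)
    capacity = math.ceil(len(clusters) / num_groups)  # assumed equal-size groups

    groups = [[] for _ in range(num_groups)]        # cluster indices per group
    pooled = [[] for _ in range(num_groups)]        # pooled labels per group

    # Place the most label-skewed (lowest-entropy) clusters first.
    order = sorted(
        range(len(clusters)),
        key=lambda i: entropy(label_distribution(clusters[i], num_classes)),
    )
    for i in order:
        candidates = [g for g in range(num_groups) if len(groups[g]) < capacity]
        best = min(
            candidates,
            key=lambda g: merged_cost(pooled[g], clusters[i], global_dist, num_classes),
        )
        groups[best].append(i)
        pooled[best] = pooled[best] + clusters[i]
    return groups


# Toy example: four single-label clusters, two classes, two groups.
toy_clusters = [[0] * 10, [1] * 10, [0] * 10, [1] * 10]
toy_groups = recombine(toy_clusters, num_classes=2, num_groups=2)
print(toy_groups)
```

In the toy example each resulting group pairs a class-0 cluster with a class-1 cluster, so both groups reproduce the whole-graph label distribution of one half per class. The capacity limit matters here: without it, a purely greedy rule would tend to pile every cluster into the first group that becomes balanced.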

Original language: English
Title of host publication: Proceedings - 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023
Editors: Jihe Wang, Yi He, Thang N. Dinh, Christan Grant, Meikang Qiu, Witold Pedrycz
Publisher: IEEE Computer Society
Pages: 1170-1177
Number of pages: 8
ISBN (Electronic): 9798350381641
State: Published - 2023
Event: 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023 - Shanghai, China
Duration: 1 Dec 2023 to 4 Dec 2023

Publication series

Name: IEEE International Conference on Data Mining Workshops, ICDMW
ISSN (Print): 2375-9232
ISSN (Electronic): 2375-9259

Conference

Conference: 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023
Country/Territory: China
City: Shanghai
Period: 1/12/23 to 4/12/23

Keywords

  • graph clustering
  • graph convolutional networks
  • label entropy
  • recombination
