Abstract
Recent advances in data processing have stimulated the demand for learning graphs at very large scales. Graph Neural Networks (GNNs), an emerging and powerful approach to graph learning tasks, are known to be difficult to scale up. Most scalable models apply node-based techniques to simplify the expensive message-passing propagation procedure of GNNs. However, we find such acceleration insufficient when applied to million- or even billion-scale graphs. In this work, we propose SCARA, a scalable GNN with feature-oriented optimization for graph computation. SCARA efficiently computes graph embeddings from node features, and further selects and reuses feature computation results to reduce overhead. Theoretical analysis indicates that our model achieves sub-linear time complexity with guaranteed precision in the propagation process as well as in GNN training and inference. We conduct extensive experiments on various datasets to evaluate the efficacy and efficiency of SCARA. Performance comparison with baselines shows that SCARA achieves up to 100× faster graph propagation than current state-of-the-art methods, with fast convergence and comparable accuracy. Most notably, it can complete precomputation on the largest available billion-scale GNN dataset, Papers100M (111M nodes, 1.6B edges), in 100 seconds.
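For intuition only, the sketch below illustrates the feature-oriented propagation idea the abstract describes: each feature column is pushed through a truncated personalized-PageRank-style series, and a column that closely resembles an already-propagated one reuses that cached result instead of being propagated from scratch. The function name `propagate_features`, the parameters `alpha`, `hops`, and `reuse_threshold`, and the similarity-based reuse heuristic are illustrative assumptions; they are not SCARA's actual Feature-Push and Feature-Reuse procedures.

```python
# Hypothetical sketch of feature-oriented propagation with column reuse.
# Not the paper's algorithm; a simplified illustration of the idea.
import numpy as np
import scipy.sparse as sp

def propagate_features(adj: sp.csr_matrix, X: np.ndarray,
                       alpha: float = 0.2, hops: int = 10,
                       reuse_threshold: float = 0.9) -> np.ndarray:
    """Approximate Z = sum_l alpha*(1-alpha)^l * (D^-1/2 A D^-1/2)^l X,
    column by column, reusing a cached propagated column when a new
    feature column is strongly correlated with it."""
    deg = np.asarray(adj.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(np.maximum(deg, 1)))
    A_norm = d_inv_sqrt @ adj @ d_inv_sqrt

    n, f = X.shape
    Z = np.zeros((n, f), dtype=np.float64)
    cache_in, cache_out = [], []          # raw columns and their propagated versions

    for j in range(f):
        x = X[:, j].astype(np.float64)

        # Reuse heuristic: if x is nearly parallel to a cached column,
        # start from the cached result and only propagate the residual.
        reused, base = None, x
        for xc, zc in zip(cache_in, cache_out):
            denom = np.linalg.norm(x) * np.linalg.norm(xc)
            if denom > 0 and abs(x @ xc) / denom >= reuse_threshold:
                scale = (x @ xc) / (xc @ xc)
                reused, base = scale * zc, x - scale * xc
                break

        # Truncated propagation of the (residual) column.
        z, r = np.zeros(n), base.copy()
        for _ in range(hops):
            z += alpha * r
            r = (1 - alpha) * (A_norm @ r)
        z += r  # fold the remaining mass back in

        Z[:, j] = z if reused is None else z + reused
        cache_in.append(x)
        cache_out.append(Z[:, j].copy())
    return Z
```

Because propagation is linear, the reused column and the propagated residual can simply be added; working per feature column rather than per node is, at a high level, what lets the precomputation cost track the feature dimension instead of the node count, which is the motivation behind the feature-oriented design.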
| Original language | English |
|---|---|
| Pages (from-to) | 3240-3248 |
| Number of pages | 9 |
| Journal | Proceedings of the VLDB Endowment |
| Volume | 15 |
| Issue number | 11 |
| DOIs | |
| State | Published - 2022 |
| Event | 48th International Conference on Very Large Data Bases, VLDB 2022 - Sydney, Australia |
| Duration | 5 Sep 2022 → 9 Sep 2022 |