Parallel subgraph listing in a large-scale graph

  • Yingxia Shao
  • , Bin Cui
  • , Lei Chen
  • , Lin Ma
  • , Junjie Yao
  • , Ning Xu

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

102 Scopus citations

Abstract

Subgraph listing is a fundamental operation to many graph and network analyses. The problem itself is computationally expensive and is well-studied in centralized processing algorithms. However, the centralized solutions cannot scale well to large graphs. Recently, several parallel approaches are introduced to handle the large graphs. Unfortunately, these parallel approaches still rely on the expensive join operations, thus cannot achieve high performance. In this paper, we design a novel parallel subgraph listing framework, named PSgL. The PSgL iteratively enumerates subgraph instances and solves the subgraph listing in a divide-and-conquer fashion. The framework completely relies on the graph traversal, and avoids the explicit join operation. Moreover, in order to improve its performance, we propose several solutions to balance the workload and reduce the size of intermediate results. Specially, we prove the problem of partial subgraph instance distribution for workload balance is NP-hard, and carefully design a set of heuristic strategies. To further reduce the enormous intermediate results, we introduce three independent mechanisms, which are automorphism breaking of the pattern graph, initial pattern vertex selection based on a cost model, and a pruning method based on a light-weight index. We have implemented the prototype of PSgL, and run compre- hensive experiments of various graph listing operations on diverse large graphs. The experiments clearly demonstrate that PSgL is robust and can achieve performance gain over the state-of-the-art solutions up to 90%.

Original languageEnglish
Title of host publicationSIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
PublisherAssociation for Computing Machinery
Pages625-636
Number of pages12
ISBN (Print)9781450323765
DOIs
StatePublished - 2014
Externally publishedYes
Event2014 ACM SIGMOD International Conference on Management of Data, SIGMOD 2014 - Snowbird, UT, United States
Duration: 22 Jun 201427 Jun 2014

Publication series

NameProceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print)0730-8078

Conference

Conference2014 ACM SIGMOD International Conference on Management of Data, SIGMOD 2014
Country/TerritoryUnited States
CitySnowbird, UT
Period22/06/1427/06/14

Keywords

  • Graph
  • Graph algorithm
  • Graph enumeration
  • Subgraph listing

Fingerprint

Dive into the research topics of 'Parallel subgraph listing in a large-scale graph'. Together they form a unique fingerprint.

Cite this