TY - JOUR
T1 - Efficient Exact Subgraph Matching via GNN-based Path Dominance Embedding
AU - Ye, Yutong
AU - Lian, Xiang
AU - Chen, Mingsong
N1 - Publisher Copyright:
© 2024, VLDB Endowment. All rights reserved.
PY - 2024
Y1 - 2024
N2 - The classic problem of exact subgraph matching returns those subgraphs in a large-scale data graph that are isomorphic to a given query graph, which has gained increasing importance in many real-world applications such as social network analysis, knowledge graph discovery in the Semantic Web, bibliographical network mining, and so on. In this paper, we propose a novel and effective graph neural network (GNN)-based path embedding framework (GNN-PE), which allows efficient exact subgraph matching without introducing false dismissals. Unlike traditional GNN-based graph embeddings that only produce approximate subgraph matching results, in this paper, we carefully devise GNN-based embeddings for paths, such that: if two paths (and 1-hop neighbors of vertices on them) have the subgraph relationship, their corresponding GNN-based embedding vectors will strictly follow the dominance relationship. With such a newly designed property of path dominance embeddings, we are able to propose effective pruning strategies based on path label/dominance embeddings and guarantee no false dismissals for subgraph matching. We build multidimensional indexes over path embedding vectors, and develop an efficient subgraph matching algorithm by traversing indexes over graph partitions in parallel and applying our pruning methods. We also propose a cost-model-based query plan that obtains query paths from the query graph with low query cost. Through extensive experiments, we confirm the efficiency and effectiveness of our proposed GNN-PE approach for exact subgraph matching on both real and synthetic graph data.
AB - The classic problem of exact subgraph matching returns those subgraphs in a large-scale data graph that are isomorphic to a given query graph, which has gained increasing importance in many real-world applications such as social network analysis, knowledge graph discovery in the Semantic Web, bibliographical network mining, and so on. In this paper, we propose a novel and effective graph neural network (GNN)-based path embedding framework (GNN-PE), which allows efficient exact subgraph matching without introducing false dismissals. Unlike traditional GNN-based graph embeddings that only produce approximate subgraph matching results, in this paper, we carefully devise GNN-based embeddings for paths, such that: if two paths (and 1-hop neighbors of vertices on them) have the subgraph relationship, their corresponding GNN-based embedding vectors will strictly follow the dominance relationship. With such a newly designed property of path dominance embeddings, we are able to propose effective pruning strategies based on path label/dominance embeddings and guarantee no false dismissals for subgraph matching. We build multidimensional indexes over path embedding vectors, and develop an efficient subgraph matching algorithm by traversing indexes over graph partitions in parallel and applying our pruning methods. We also propose a cost-model-based query plan that obtains query paths from the query graph with low query cost. Through extensive experiments, we confirm the efficiency and effectiveness of our proposed GNN-PE approach for exact subgraph matching on both real and synthetic graph data.
UR - https://www.scopus.com/pages/publications/85195663725
U2 - 10.14778/3654621.3654630
DO - 10.14778/3654621.3654630
M3 - 会议文章
AN - SCOPUS:85195663725
SN - 2150-8097
VL - 17
SP - 1628
EP - 1641
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 7
T2 - 50th International Conference on Very Large Data Bases, VLDB 2024
Y2 - 24 August 2024 through 29 August 2024
ER -