TY - GEN
T1 - A Sketch Propagation Framework for Hub Queries on Unmaterialized Relational Graphs
AU - Niu, Yudong
AU - Li, Yuchen
AU - Karras, Panagiotis
AU - Wang, Yanhao
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Relational graphs encapsulate nontrivial inherent interactions among entities in heterogeneous data sources. Iden-tifying hubs in relational graphs is vital in various applications such as fraud detection, influence analysis, and protein complex discovery. However, building relational graphs induced by meta-paths on heterogeneous data entails substantial costs, thus hin-dering efficient hub discovery. In this paper, we propose a novel sketch propagation framework for approximate hub queries in induced relational graphs that avoids explicitly materializing those graphs. Our framework specifically supports hub queries that ask for all nodes whose centrality scores, based on degree or h-index, are in the top quantile with provable guarantees under the notion of -separable sets. In addition, we devise pruning techniques that efficiently process personalized hub queries asking whether a given node is a hub. Extensive experiments on real-world and synthetic data confirm the efficacy and efficiency of our proposals, which achieve orders of magnitude speed-ups over exact methods while consistently attaining accuracy beyond 90%.
AB - Relational graphs encapsulate nontrivial inherent interactions among entities in heterogeneous data sources. Iden-tifying hubs in relational graphs is vital in various applications such as fraud detection, influence analysis, and protein complex discovery. However, building relational graphs induced by meta-paths on heterogeneous data entails substantial costs, thus hin-dering efficient hub discovery. In this paper, we propose a novel sketch propagation framework for approximate hub queries in induced relational graphs that avoids explicitly materializing those graphs. Our framework specifically supports hub queries that ask for all nodes whose centrality scores, based on degree or h-index, are in the top quantile with provable guarantees under the notion of -separable sets. In addition, we devise pruning techniques that efficiently process personalized hub queries asking whether a given node is a hub. Extensive experiments on real-world and synthetic data confirm the efficacy and efficiency of our proposals, which achieve orders of magnitude speed-ups over exact methods while consistently attaining accuracy beyond 90%.
KW - degree centrality
KW - h-index
KW - hub query
KW - KMV sketch
KW - relational graph
UR - https://www.scopus.com/pages/publications/105015527307
U2 - 10.1109/ICDE65448.2025.00225
DO - 10.1109/ICDE65448.2025.00225
M3 - 会议稿件
AN - SCOPUS:105015527307
T3 - Proceedings - International Conference on Data Engineering
SP - 3003
EP - 3016
BT - Proceedings - 2025 IEEE 41st International Conference on Data Engineering, ICDE 2025
PB - IEEE Computer Society
T2 - 41st IEEE International Conference on Data Engineering, ICDE 2025
Y2 - 19 May 2025 through 23 May 2025
ER -