Meta structure: Computing relevance in large heterogeneous information networks

  • Zhipeng Huang
  • Yudian Zheng
  • Reynold Cheng
  • Yizhou Sun
  • Nikos Mamoulis
  • Xiang Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

205 Scopus citations

Abstract

A heterogeneous information network (HIN) is a graph model in which objects and edges are annotated with types. Large and complex databases, such as YAGO and DBLP, can be modeled as HINs. A fundamental problem in HINs is the computation of closeness, or relevance, between two HIN objects. Relevance measures can be used in various applications, including entity resolution, recommendation, and information retrieval. Several studies have investigated the use of HIN information for relevance computation; however, most of them use only simple structures, such as paths, to measure the similarity between objects. In this paper, we propose to use a meta structure, which is a directed acyclic graph of object types connected by edge types, to measure the proximity between objects. The strength of a meta structure is that it can describe a complex relationship between two HIN objects (e.g., two papers in DBLP that share the same authors and topics). We develop three relevance measures based on meta structures. Because of the computational complexity of these measures, we further design an algorithm, together with supporting data structures, for their evaluation. Our extensive experiments on YAGO and DBLP show that meta structure-based relevance is more effective than state-of-the-art approaches and can be computed efficiently.
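
The DBLP example in the abstract (two papers that share the same authors and topics) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy network, the names `node_type`, `edges`, `neighbors`, and `count_instances`, and the simple counting rule are hypothetical and are not one of the three relevance measures developed in the paper. The sketch only shows why a structure that requires a shared author and a shared topic at the same time is stricter than a single path.

```python
# Toy HIN: every node has a type; edges run from papers to authors/topics.
# All node names and links below are made-up illustration data.
node_type = {
    "p1": "Paper", "p2": "Paper", "p3": "Paper",
    "a1": "Author", "a2": "Author",
    "t1": "Topic", "t2": "Topic",
}
edges = {
    ("p1", "a1"), ("p1", "a2"), ("p1", "t1"),
    ("p2", "a1"), ("p2", "t1"), ("p2", "t2"),
    ("p3", "a2"), ("p3", "t2"),
}


def neighbors(paper, wanted_type):
    """Return the nodes of the given type that the paper links to."""
    return {dst for src, dst in edges
            if src == paper and node_type[dst] == wanted_type}


def count_instances(source, target):
    """Count (author, topic) pairs linked to BOTH papers.

    This mirrors the shape Paper -> {Author, Topic} -> Paper: an instance
    needs a shared author and a shared topic simultaneously.
    """
    shared_authors = neighbors(source, "Author") & neighbors(target, "Author")
    shared_topics = neighbors(source, "Topic") & neighbors(target, "Topic")
    return len(shared_authors) * len(shared_topics)


if __name__ == "__main__":
    print(count_instances("p1", "p2"))  # share author a1 and topic t1
    print(count_instances("p1", "p3"))  # share author a2 but no topic
```

In this toy data, p1 and p3 are connected through a common author, so a path-only view would treat them as related, while the combined author-and-topic requirement finds no instance between them.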

Original language: English
Title of host publication: KDD 2016 - Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 1595-1604
Number of pages: 10
ISBN (Electronic): 9781450342322
DOIs
State: Published - 13 Aug 2016
Externally published: Yes
Event: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016 - San Francisco, United States
Duration: 13 Aug 2016 to 17 Aug 2016

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: 13-17-August-2016

Conference

Conference: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016
Country/Territory: United States
City: San Francisco
Period: 13/08/16 to 17/08/16
