Semi-supervised clustering in attributed heterogeneous information networks

  • Xiang Li
  • Yao Wu
  • Martin Ester
  • Ben Kao
  • Xin Wang
  • Yudian Zheng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

78 Scopus citations

Abstract

A heterogeneous information network (HIN) is one whose nodes model objects of different types and whose links model objects’ relationships. In many applications, such as social networks and RDF-based knowledge bases, information can be modeled as HINs. To enrich an HIN's information content, its objects (represented by nodes) are typically associated with additional attributes. We call such an HIN an Attributed HIN or AHIN. We study the problem of clustering objects in an AHIN, taking into account objects’ similarities with respect to both object attribute values and their structural connectedness in the network. We show how a supervision signal, expressed in the form of a must-link set and a cannot-link set, can be leveraged to improve clustering results. We put forward the SCHAIN algorithm to solve the clustering problem. We conduct extensive experiments comparing SCHAIN with other state-of-the-art clustering algorithms and show that SCHAIN outperforms the others in clustering quality.

Original language: English
Title of host publication: 26th International World Wide Web Conference, WWW 2017
Publisher: International World Wide Web Conferences Steering Committee
Pages: 1621-1629
Number of pages: 9
ISBN (Print): 9781450349130
State: Published - 2017
Externally published: Yes
Event: 26th International World Wide Web Conference, WWW 2017 - Perth, Australia
Duration: 3 Apr 2017 to 7 Apr 2017

Publication series

Name: 26th International World Wide Web Conference, WWW 2017

Conference

Conference: 26th International World Wide Web Conference, WWW 2017
Country/Territory: Australia
City: Perth
Period: 3/04/17 to 7/04/17

Keywords

  • Attributed heterogeneous information network
  • Network structure
  • Object attributes
  • Semi-supervised clustering

