TY - GEN
T1 - Semi-supervised clustering in attributed heterogeneous information networks
AU - Li, Xiang
AU - Wu, Yao
AU - Ester, Martin
AU - Kao, Ben
AU - Wang, Xin
AU - Zheng, Yudian
N1 - Publisher Copyright:
© 2017 International World Wide Web Conference Committee (IW3C2).
PY - 2017
Y1 - 2017
N2 - A heterogeneous information network (HIN) is one whose nodes model objects of different types and whose links model objects’ relationships. In many applications, such as social networks and RDF-based knowledge bases, information can be modeled as HINs. To enrich its information content, objects (as represented by nodes) in an HIN are typically associated with additional attributes. We call such an HIN an Attributed HIN or AHIN. We study the problem of clustering objects in an AHIN, taking into account objects’ similarities with respect to both object attribute values and their structural connectedness in the network. We show how supervision signal, expressed in the form of a must-link set and a cannot-link set, can be leveraged to improve clustering results. We put forward the SCHAIN algorithm to solve the clustering problem. We conduct extensive experiments comparing SCHAIN with other state-of-the-art clustering algorithms and show that SCHAIN outperforms the others in clustering quality.
KW - Attributed heterogeneous information network
KW - Network structure
KW - Object attributes
KW - Semi-supervised clustering
UR - https://www.scopus.com/pages/publications/85046282482
U2 - 10.1145/3038912.3052576
DO - 10.1145/3038912.3052576
M3 - Conference contribution
AN - SCOPUS:85046282482
SN - 9781450349130
T3 - 26th International World Wide Web Conference, WWW 2017
SP - 1621
EP - 1629
BT - 26th International World Wide Web Conference, WWW 2017
PB - International World Wide Web Conferences Steering Committee
T2 - 26th International World Wide Web Conference, WWW 2017
Y2 - 3 April 2017 through 7 April 2017
ER -