On disambiguating authors: Collaboration network reconstruction in a bottom-up manner

Na Li, Renyu Zhu, Xiaoxu Zhou, Xiangnan He, Wenyuan Cai, Ming Gao, Aoying Zhou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Scopus citations

Abstract

Author disambiguation, the task of distinguishing different authors who share the same name, is critical in digital libraries such as DBLP, CiteULike, and CiteSeerX. While state-of-the-art approaches have developed various paper-embedding-based methods that work in a top-down manner, they primarily focus on the ego-network of a target name and overlook the low-quality collaborative relations that exist in that ego-network. Thus, these methods can be suboptimal for disambiguating authors. In this paper, we model author disambiguation as a collaboration network reconstruction problem and propose an incremental and unsupervised author disambiguation method, namely IUAD, which works in a bottom-up manner. Initially, we build a stable collaboration network based on stable collaborative relations. To further improve recall, we build a probabilistic generative model to reconstruct the complete collaboration network. In addition, for newly published papers, we can incrementally judge who published them by computing only the posterior probabilities. We have conducted extensive experiments on a large-scale DBLP dataset to evaluate IUAD. The experimental results demonstrate that IUAD not only achieves promising performance but also significantly outperforms comparable baselines. Code is available at https://github.com/papergitgit/IUAD.
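
The first, bottom-up step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes, purely for illustration, that a collaborative relation counts as "stable" when two names co-occur in at least `min_support` papers, and it treats every (paper, name) mention as a separate author until such a relation joins mentions together. The function name `stable_clusters`, the `min_support` parameter, and the input layout are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def stable_clusters(papers, min_support=2):
    """Group (paper_id, name) mentions linked by frequently co-occurring name pairs.

    papers: dict mapping a paper id to its list of author names.
    min_support: assumed threshold for calling a co-author pair "stable".
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # Record the papers in which each unordered name pair appears together.
    pair_papers = defaultdict(list)
    for pid, names in papers.items():
        for name in names:
            find((pid, name))  # start bottom-up: every mention is its own author
        for a, b in combinations(sorted(set(names)), 2):
            pair_papers[(a, b)].append(pid)

    # Merge mentions of the same name across papers that share a stable pair.
    for (a, b), pids in pair_papers.items():
        if len(pids) < min_support:
            continue
        first = pids[0]
        for pid in pids[1:]:
            union((first, a), (pid, a))
            union((first, b), (pid, b))

    clusters = defaultdict(set)
    for mention in list(parent):
        clusters[find(mention)].add(mention)
    return list(clusters.values())
```

Mentions that are never joined by a stable pair stay as singletons in this sketch. In the method the abstract describes, resolving those remaining mentions is the role of the probabilistic generative model, which is not modelled here.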

Original language: English
Title of host publication: Proceedings - 2021 IEEE 37th International Conference on Data Engineering, ICDE 2021
Publisher: IEEE Computer Society
Pages: 888-899
Number of pages: 12
ISBN (Electronic): 9781728191843
DOIs
State: Published - Apr 2021
Event: 37th IEEE International Conference on Data Engineering, ICDE 2021 - Virtual, Chania, Greece
Duration: 19 Apr 2021 to 22 Apr 2021

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2021-April
ISSN (Print): 1084-4627
ISSN (Electronic): 2375-0286

Conference

Conference: 37th IEEE International Conference on Data Engineering, ICDE 2021
Country/Territory: Greece
City: Virtual, Chania
Period: 19/04/21 to 22/04/21

Keywords

  • Author Disambiguation
  • Collaboration Network
  • Exponential Family
  • Probabilistic Generative Model
