OnPerDis: Ontology-based personal name disambiguation on the Web

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

As the number of web documents grows, ambiguous personal names become more common and degrade web search performance. Identifying the correct personal entity from part of a document or from the whole document remains a very challenging problem, especially for Chinese websites. In this paper, we propose a novel Ontology-based approach for Personal Name Disambiguation (named "OnPerDis"). The approach has two main steps. First, we construct a person ontology (PO) with rich conceptual modeling and a large set of supporting instances. Second, for a given personal name on the web, we create a temporary instance, extract its features from the web documents, and calculate the similarity between this temporary instance and the instances in the PO. The instance with the highest similarity score is chosen as the matching person. Our extensive evaluations on two rich real-life datasets (CIPS-SIGHAN 2012 NERD and Chinese web documents) show the efficacy of OnPerDis for personal name disambiguation on the Web.
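The second step described in the abstract (build a temporary instance, score it against every PO instance, keep the best match) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which features or similarity measure OnPerDis uses, so the weighted Jaccard overlap, the `PersonInstance` class, the attribute names, and the weights below are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PersonInstance:
    """A person as attribute name -> set of values (hypothetical schema)."""
    identifier: str
    attributes: dict[str, set[str]] = field(default_factory=dict)


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two value sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(temp: PersonInstance, candidate: PersonInstance,
               weights: dict[str, float]) -> float:
    """Weighted average of per-attribute Jaccard scores.

    Assumption: the abstract does not name a measure; this one is
    swapped in purely for illustration.
    """
    total = sum(weights.values())
    if total == 0:
        return 0.0
    score = sum(
        w * jaccard(temp.attributes.get(attr, set()),
                    candidate.attributes.get(attr, set()))
        for attr, w in weights.items()
    )
    return score / total


def disambiguate(temp: PersonInstance,
                 ontology_instances: list[PersonInstance],
                 weights: dict[str, float]) -> PersonInstance | None:
    """Return the ontology instance most similar to the temporary instance."""
    if not ontology_instances:
        return None
    return max(ontology_instances,
               key=lambda inst: similarity(temp, inst, weights))


if __name__ == "__main__":
    # Made-up data: two people in the ontology sharing one ambiguous name.
    weights = {"occupation": 2.0, "affiliation": 1.0}
    po = [
        PersonInstance("p1", {"occupation": {"professor"},
                              "affiliation": {"university a"}}),
        PersonInstance("p2", {"occupation": {"athlete"},
                              "affiliation": {"club b"}}),
    ]
    # Temporary instance built from features found in a web document.
    temp = PersonInstance("tmp", {"occupation": {"professor"},
                                  "affiliation": {"university a", "lab c"}})
    print(disambiguate(temp, po, weights).identifier)
```

In a real system the attribute sets would come from feature extraction over the web documents, and a minimum-score threshold would probably be needed so that a name referring to someone absent from the ontology is not forced onto the nearest instance.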

Original language: English
Title of host publication: Proceedings - 2013 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2013
Pages: 185-192
Number of pages: 8
State: Published - 2013
Event: 2013 12th IEEE/WIC/ACM International Conference on Web Intelligence, WI 2013 - Atlanta, GA, United States
Duration: 17 Nov 2013 to 20 Nov 2013

Publication series

Name: Proceedings - 2013 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2013
Volume: 1

Conference

Conference: 2013 12th IEEE/WIC/ACM International Conference on Web Intelligence, WI 2013
Country/Territory: United States
City: Atlanta, GA
Period: 17/11/13 to 20/11/13

Keywords

  • Conceptual modeling
  • Instance matching
  • Ontology population
  • Personal name disambiguation
