An NER-based product identification and lucene-based product linking approach to CPROD1 challenge: Description of submission system to CPROD1 Challenge

Zhiqiang Toh*, Wenting Wang, Man Lan, Xiaoli Li

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

2 Scopus citations

Abstract

This paper presents our methodology for the CPROD1 Challenge, whose task is to identify product mentions in text and then link each mention to entries in the catalog file. Our solution has two steps. First, we use processing pipelines that extract product mentions by combining multiple techniques: traditional named entity recognition (NER), regular expression rules, and gazetteer-based string matching. Second, we cast the product linking task as an information retrieval (IR) problem, in which the catalog description file is loaded into a database. Each product mention then acts as a search query, and the results returned from the catalog entry database serve as the links. The F1 scores of our submission on the public and private test data are 24.82% and 16.04%, respectively.
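The two steps in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the trained NER component is left out, a small pure-Python IDF-weighted token-overlap ranker stands in for the Lucene index named in the title, and every name and data item (CATALOG, GAZETTEER, MODEL_RULE, extract_mentions, build_index, link) is a hypothetical placeholder chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy data; the actual challenge catalog and gazetteer are not shown here.
CATALOG = {
    "p1": "Apple iPod Touch 32GB MP3 player",
    "p2": "Canon EOS 550D digital SLR camera",
    "p3": "Apple iPhone 4 16GB smartphone",
}
GAZETTEER = ["ipod touch", "iphone 4"]
# Illustrative rule (an assumption): a capitalised brand word, an optional
# all-caps series word, then a model code that contains at least one digit.
MODEL_RULE = re.compile(r"\b[A-Z][a-z]+ (?:[A-Z]+ )?[A-Z]*\d+[A-Z]*\b")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_mentions(text):
    """Step 1: union of gazetteer matches and regex-rule matches."""
    mentions = set()
    lowered = text.lower()
    for entry in GAZETTEER:
        if re.search(r"\b" + re.escape(entry) + r"\b", lowered):
            mentions.add(entry)
    for match in MODEL_RULE.finditer(text):
        mentions.add(match.group(0).lower())
    return sorted(mentions)


def build_index(catalog):
    """Step 2a: per-entry token counts plus IDF weights (stand-in for a real index)."""
    doc_tokens = {pid: Counter(tokenize(desc)) for pid, desc in catalog.items()}
    # Iterating a Counter yields each distinct token once, so this is document frequency.
    doc_freq = Counter(tok for toks in doc_tokens.values() for tok in toks)
    n_docs = len(catalog)
    idf = {tok: math.log(1 + n_docs / count) for tok, count in doc_freq.items()}
    return doc_tokens, idf


def link(mention, index, top_k=1):
    """Step 2b: treat the mention as a query and return the best-scoring catalog ids."""
    doc_tokens, idf = index
    query = tokenize(mention)
    scored = []
    for pid, toks in doc_tokens.items():
        score = sum(idf[tok] for tok in query if tok in toks)
        if score > 0:
            scored.append((score, pid))
    # Highest score first; ties are broken by catalog id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pid for _, pid in scored[:top_k]]


if __name__ == "__main__":
    index = build_index(CATALOG)
    text = "I sold my iPod Touch and bought a Canon EOS 550D last week."
    for mention in extract_mentions(text):
        print(mention, "->", link(mention, index))
```

Taking the union of the extractors favours recall at the mention stage, and the ranker only has to order candidate entries; a production system would presumably add filtering between the two steps.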

Original language: English
Title of host publication: Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012
Pages: 869-871
Number of pages: 3
State: Published - 2012
Externally published: Yes
Event: 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012 - Brussels, Belgium
Duration: 10 Dec 2012 → 10 Dec 2012

Publication series

Name: Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012

Conference

Conference: 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012
Country/Territory: Belgium
City: Brussels
Period: 10/12/12 → 10/12/12

Keywords

  • Named entity recognition
  • Product disambiguation
  • Product identification
  • Product linking
