TY - GEN
T1 - An NER-based product identification and Lucene-based product linking approach to CPROD1 challenge
T2 - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012
AU - Toh, Zhiqiang
AU - Wang, Wenting
AU - Lan, Man
AU - Li, Xiaoli
PY - 2012
Y1 - 2012
N2 - This paper presents our methodology for the CPROD1 Challenge, whose task is to identify product mentions in text and then link each mention to entries in the catalog file. Our solution follows two steps. First, we use processing pipelines to extract product mentions, combining multiple techniques including traditional named entity recognition (NER), regular expression rules, and gazetteer-based string matching. Second, we cast the product linking task as an information retrieval (IR) problem, in which the description catalog file is loaded into a database. Each product mention then acts as a search query, and the results returned from the catalog entry database serve as the links. The F1 scores of our submission on the public and private test data are 24.82% and 16.04%, respectively.
AB - This paper presents our methodology for the CPROD1 Challenge, whose task is to identify product mentions in text and then link each mention to entries in the catalog file. Our solution follows two steps. First, we use processing pipelines to extract product mentions, combining multiple techniques including traditional named entity recognition (NER), regular expression rules, and gazetteer-based string matching. Second, we cast the product linking task as an information retrieval (IR) problem, in which the description catalog file is loaded into a database. Each product mention then acts as a search query, and the results returned from the catalog entry database serve as the links. The F1 scores of our submission on the public and private test data are 24.82% and 16.04%, respectively.
KW - Named entity recognition
KW - Product disambiguation
KW - Product identification
KW - Product linking
UR - https://www.scopus.com/pages/publications/84873185336
U2 - 10.1109/ICDMW.2012.66
DO - 10.1109/ICDMW.2012.66
M3 - Conference contribution
AN - SCOPUS:84873185336
SN - 9780769549255
T3 - Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012
SP - 869
EP - 871
BT - Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012
Y2 - 10 December 2012 through 10 December 2012
ER -