TY - JOUR
T1 - Improving Chinese Named Entity Recognition by Large-Scale Syntactic Dependency Graph
AU - Zhu, Peng
AU - Cheng, Dawei
AU - Yang, Fangzhou
AU - Luo, Yifeng
AU - Huang, Dingjiang
AU - Qian, Weining
AU - Zhou, Aoying
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2022
Y1 - 2022
N2 - Named entity recognition (NER) is a preliminary task in natural language processing (NLP). Recognizing Chinese named entities from unstructured texts is challenging due to the lack of word boundaries. Even if performing Chinese Word Segmentation (CWS) could help to determine word boundaries, it is still difficult to determine which words should be clustered together for entity identification, since entities are often composed of multiple segmented words. As dependency relationships between segmented words could help to determine entity boundaries, it is crucial to employ information related to syntactic dependency relationships to improve NER performance. In this paper, we propose a novel NER model to learn information about syntactic dependency graphs with graph neural networks, and merge learned information into the classic Bidirectional Long Short-Term Memory (BiLSTM)-Conditional Random Field (CRF) NER scheme. In addition, we extract various kinds of task-specific hidden information from multiple CWS and part-of-speech (POS) tagging tasks, to further improve the NER model. We finally leverage multiple self-attention components to integrate multiple kinds of extracted information for named entity identification. Experimental results on three public benchmark datasets show that our model outperforms the state-of-the-art baselines in most scenarios.
AB - Named entity recognition (NER) is a preliminary task in natural language processing (NLP). Recognizing Chinese named entities from unstructured texts is challenging due to the lack of word boundaries. Even if performing Chinese Word Segmentation (CWS) could help to determine word boundaries, it is still difficult to determine which words should be clustered together for entity identification, since entities are often composed of multiple segmented words. As dependency relationships between segmented words could help to determine entity boundaries, it is crucial to employ information related to syntactic dependency relationships to improve NER performance. In this paper, we propose a novel NER model to learn information about syntactic dependency graphs with graph neural networks, and merge learned information into the classic Bidirectional Long Short-Term Memory (BiLSTM)-Conditional Random Field (CRF) NER scheme. In addition, we extract various kinds of task-specific hidden information from multiple CWS and part-of-speech (POS) tagging tasks, to further improve the NER model. We finally leverage multiple self-attention components to integrate multiple kinds of extracted information for named entity identification. Experimental results on three public benchmark datasets show that our model outperforms the state-of-the-art baselines in most scenarios.
KW - Graph neural network
KW - Multi-task learning
KW - Named entity recognition
KW - Self-attention
KW - Syntactic dependency graph
UR - https://www.scopus.com/pages/publications/85125695493
U2 - 10.1109/TASLP.2022.3153261
DO - 10.1109/TASLP.2022.3153261
M3 - Article
AN - SCOPUS:85125695493
SN - 2329-9290
VL - 30
SP - 979
EP - 991
JO - IEEE/ACM Transactions on Audio Speech and Language Processing
JF - IEEE/ACM Transactions on Audio Speech and Language Processing
ER -