TY - JOUR
T1 - Military-domain named entity recognition with multi-neural network collaboration
AU - Yin, Xuezhen
AU - Zhao, Hui
AU - Zhao, Junbao
AU - Yao, Wanwei
AU - Huang, Zelin
N1 - Publisher Copyright: © 2020, Tsinghua University Press. All rights reserved.
PY - 2020/8/1
Y1 - 2020/8/1
AB - Web data contains a large amount of high-value military information and has become an important data source for open-source military intelligence. Military named entity recognition is a basic, key task underlying information extraction, question answering, and knowledge graphs in the military domain. It faces challenges not seen in named entity recognition for other domains: military entity boundaries are vague and difficult to define, Internet media lack standardized military terminology, abbreviations are used extensively, and no public military-oriented corpus exists. This paper presents an entity labeling strategy that accounts for fuzzy entity boundaries and, by combining domain expert knowledge, constructs a military-oriented corpus called MilitaryCorpus from microblog data. A named entity recognition model based on multi-neural network collaboration is then developed. Character-level features are learned in the BERT (bidirectional encoder representations from transformers)-based Chinese character embedding layer, and context features are extracted in the BiLSTM (bidirectional long short-term memory) layer; together they form the feature matrix. Finally, the CRF (conditional random field) layer generates the optimal tag sequence. Tests show that the recall and F-score of the BERT-BiLSTM-CRF model are, respectively, 28.48% and 18.65% higher than those of a CRF-based entity recognition model, 13.91% and 8.69% higher than those of a BiLSTM-CRF-based model, and 7.08% and 5.15% higher than those of a CNN (convolutional neural network)-BiLSTM-CRF-based model.
KW - Bidirectional encoder representations from transformers (BERT)
KW - Fuzzy boundary
KW - Military named entity recognition
KW - Multi-neural network
UR - https://www.scopus.com/pages/publications/85086992874
U2 - 10.16511/j.cnki.qhdxxb.2020.25.004
DO - 10.16511/j.cnki.qhdxxb.2020.25.004
M3 - Article
AN - SCOPUS:85086992874
SN - 1000-0054
VL - 60
SP - 648
EP - 655
JO - Qinghua Daxue Xuebao/Journal of Tsinghua University
JF - Qinghua Daxue Xuebao/Journal of Tsinghua University
IS - 8
ER -