TY - GEN
T1 - NN-Denoising
T2 - 32nd International Conference on Artificial Neural Networks, ICANN 2023
AU - Pan, Mengting
AU - Wang, Ye
AU - Chen, Zhiyun
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - The task of document-level relation extraction (DocRE) is crucial in the field of natural language processing, as it aims to extract semantic relations between entities in a given document to facilitate deeper comprehension. Previous methods have primarily focused on fully supervised learning for DocRE, which requires a large amount of human-annotated training data, making annotation a tedious and laborious task. Recently, increasing attention has been paid to the incomplete-labeling problem in human-annotated data, which is believed to be the bottleneck of model performance. To address this limitation and mitigate annotation costs, we propose a low-noise distant supervision scheme for DocRE, called NN-Denoising, which combines natural language inference (NLI) models and negative sampling to filter out noise in the training data. The NLI model serves as a pre-filter for denoising the distant supervision (DS) labels, while negative sampling is employed to overcome the false negative problem in the filtered data. Our experimental results on a large-scale DocRE benchmark demonstrate the superiority of the proposed approach over existing baselines in distant supervision learning. Specifically, NN-Denoising achieves an improvement of 15.83 F1 points and 10.34 F1 points compared to the ATLOP and SSR-PU models, respectively.
KW - Distantly Supervised Learning
KW - Document-level Relation Extraction
KW - Low-Noise
UR - https://www.scopus.com/pages/publications/85174635732
U2 - 10.1007/978-3-031-44213-1_42
DO - 10.1007/978-3-031-44213-1_42
M3 - Conference contribution
AN - SCOPUS:85174635732
SN - 9783031442124
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 503
EP - 515
BT - Artificial Neural Networks and Machine Learning – ICANN 2023 - 32nd International Conference on Artificial Neural Networks, Proceedings
A2 - Iliadis, Lazaros
A2 - Papaleonidas, Antonios
A2 - Angelov, Plamen
A2 - Jayne, Chrisina
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 26 September 2023 through 29 September 2023
ER -