跳到主要导航 跳到搜索 跳到主要内容

Uncertainty-Aware Self-training for Low-Resource Neural Sequence Labeling

  • East China Normal University
  • Alibaba Group Holding Ltd.

科研成果: 书/报告/会议事项章节会议稿件同行评审

摘要

Neural sequence labeling (NSL) aims at assigning labels for input language tokens, which covers a broad range of applications, such as named entity recognition (NER) and slot filling, etc. However, the satisfying results achieved by traditional supervised-based approaches heavily depend on the large amounts of human annotation data, which may not be feasible in real-world scenarios due to data privacy and computation efficiency issues. This paper presents SeqUST, a novel uncertain-aware self-training framework for NSL to address the labeled data scarcity issue and to effectively utilize unlabeled data. Specifically, we incorporate Monte Carlo (MC) dropout in Bayesian neural network (BNN) to perform uncertainty estimation at the token level and then select reliable language tokens from unlabeled data based on the model confidence and certainty. A well-designed masked sequence labeling task with a noise-robust loss supports robust training, which aims to suppress the problem of noisy pseudo labels. In addition, we develop a Gaussian-based consistency regularization technique to further improve the model robustness on Gaussian-distributed perturbed representations. This effectively alleviates the over-fitting dilemma originating from pseudo-labeled augmented data. Extensive experiments over six benchmarks demonstrate that our SeqUST framework effectively improves the performance of self-training, and consistently outperforms strong baselines by a large margin in low-resource scenarios.

源语言英语
主期刊名AAAI-23 Technical Tracks 11
编辑Brian Williams, Yiling Chen, Jennifer Neville
出版商AAAI press
13682-13690
页数9
ISBN(电子版)9781577358800
DOI
出版状态已出版 - 27 6月 2023
活动37th AAAI Conference on Artificial Intelligence, AAAI 2023 - Washington, 美国
期限: 7 2月 202314 2月 2023

出版系列

姓名Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023
37

会议

会议37th AAAI Conference on Artificial Intelligence, AAAI 2023
国家/地区美国
Washington
时期7/02/2314/02/23

指纹

探究 'Uncertainty-Aware Self-training for Low-Resource Neural Sequence Labeling' 的科研主题。它们共同构成独一无二的指纹。

引用此