TY - GEN
T1 - CAT-BERT: A Context-Aware Transferable BERT Model for Multi-turn Machine Reading Comprehension
T2 - 26th International Conference on Database Systems for Advanced Applications, DASFAA 2021
AU - Chen, Cen
AU - Huang, Xinjing
AU - Ji, Feng
AU - Wang, Chengyu
AU - Qiu, Minghui
AU - Huang, Jun
AU - Zhang, Yin
N1 - Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Machine Reading Comprehension (MRC) is an important NLP task with the goal of extracting answers to user questions from background passages. For conversational applications, modeling the contexts under the multi-turn setting is highly necessary for MRC, which has drawn great attention recently. Past studies on multi-turn MRC usually focus on a single domain, ignoring the fact that knowledge in different MRC tasks is transferable. To address this issue, we present a unified framework to model both single-turn and multi-turn MRC tasks, which allows knowledge sharing from different source MRC tasks to help solve the target MRC task. Specifically, the Context-Aware Transferable Bidirectional Encoder Representations from Transformers (CAT-BERT) model is proposed, which jointly learns to solve both single-turn and multi-turn MRC tasks in a single pre-trained language model. In this model, both history questions and answers are encoded into the contexts for the multi-turn setting. To capture the task-level importance of different layer outputs, a task-specific attention layer is further added to the CAT-BERT outputs, reflecting the positions that the model should pay attention to for a specific MRC task. Extensive experimental results and ablation studies show that CAT-BERT achieves competitive results in multi-turn MRC tasks, outperforming strong baselines.
AB - Machine Reading Comprehension (MRC) is an important NLP task with the goal of extracting answers to user questions from background passages. For conversational applications, modeling the contexts under the multi-turn setting is highly necessary for MRC, which has drawn great attention recently. Past studies on multi-turn MRC usually focus on a single domain, ignoring the fact that knowledge in different MRC tasks is transferable. To address this issue, we present a unified framework to model both single-turn and multi-turn MRC tasks, which allows knowledge sharing from different source MRC tasks to help solve the target MRC task. Specifically, the Context-Aware Transferable Bidirectional Encoder Representations from Transformers (CAT-BERT) model is proposed, which jointly learns to solve both single-turn and multi-turn MRC tasks in a single pre-trained language model. In this model, both history questions and answers are encoded into the contexts for the multi-turn setting. To capture the task-level importance of different layer outputs, a task-specific attention layer is further added to the CAT-BERT outputs, reflecting the positions that the model should pay attention to for a specific MRC task. Extensive experimental results and ablation studies show that CAT-BERT achieves competitive results in multi-turn MRC tasks, outperforming strong baselines.
KW - Machine reading comprehension
KW - Pre-trained language model
KW - Question answering
KW - Transfer learning
UR - https://www.scopus.com/pages/publications/85104701687
U2 - 10.1007/978-3-030-73197-7_10
DO - 10.1007/978-3-030-73197-7_10
M3 - Conference contribution
AN - SCOPUS:85104701687
SN - 9783030731960
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 152
EP - 167
BT - Database Systems for Advanced Applications - 26th International Conference, DASFAA 2021, Proceedings
A2 - Jensen, Christian S.
A2 - Lim, Ee-Peng
A2 - Yang, De-Nian
A2 - Chang, Chia-Hui
A2 - Xu, Jianliang
A2 - Peng, Wen-Chih
A2 - Huang, Jen-Wei
A2 - Shen, Chih-Ya
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 11 April 2021 through 14 April 2021
ER -