TY - JOUR
T1 - Metacognitive symbolic distillation framework for multi-choice machine reading comprehension
AU - Yao, Jiacheng
AU - Xu, Xin
AU - He, Guoxiu
N1 - Publisher Copyright:
© 2025 Elsevier B.V.
PY - 2025/3/15
Y1 - 2025/3/15
AB - Symbolic knowledge distillation can effectively transfer the reasoning abilities of large language models (LLMs) to smaller models. However, in the context of multi-choice machine reading comprehension (MMRC), traditional distillation methods focus primarily on learning from the rationales that the large teacher model generates for the correct options, overlooking the educational value of the reasoning behind incorrect options. In human education, metacognition emphasizes actively identifying errors, thereby enhancing overall understanding. Inspired by this, we propose a framework that incorporates metacognition into symbolic distillation. First, we prompt the teacher LLM to generate rationales for all options in the MMRC dataset. The small student model is then fine-tuned on these rationales, including those for incorrect options. Experiments on two MMRC datasets demonstrate that this framework significantly improves the performance of the small student model compared with standard fine-tuned and distilled models. We further find that when the student model is sufficiently large, upgrading the teacher model can yield additional improvements. However, the effectiveness of our framework is constrained by the teacher model's performance on more complex MMRC tasks.
KW - Knowledge distillation
KW - Large language model
KW - Metacognition
KW - Multi-choice machine reading comprehension
KW - Symbolic distillation
UR - https://www.scopus.com/pages/publications/85217905443
U2 - 10.1016/j.knosys.2025.113130
DO - 10.1016/j.knosys.2025.113130
M3 - Article
AN - SCOPUS:85217905443
SN - 0950-7051
VL - 312
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 113130
ER -