TY - GEN
T1 - Enhancing Small Model Performance in Educational Classification Tasks through Knowledge Distillation
AU - Xu, Haoxin
AU - Qi, Changyong
AU - Jiang, Bingqian
AU - Liu, Tong
AU - Zheng, Longwei
AU - Gu, Xiaoqing
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - As the demand for precision, efficiency, and low-cost solutions in educational classification tasks continues to grow, enhancing model performance has become a critical focus of research. While large language models excel in these tasks, their high cost and resource requirements limit widespread application. This study proposes a Knowledge-Enhanced Distillation (KED) method, utilizing ChatGPT-4, ChatGPT-4o, and Llama3 as teacher models, and three different sizes of BERT models as student models. The method was validated across three real-world educational datasets. The results demonstrate that the KED method significantly improves the accuracy and F1 scores of small models in educational text classification tasks, while also substantially reducing computational costs and resource consumption. Notably, the KED method shows exceptional performance in scenarios involving few-shot learning and class imbalance. The innovation of this study lies in applying the KED method to educational classification tasks, filling a gap in current research and highlighting its significant potential for practical application in educational contexts.
AB - As the demand for precision, efficiency, and low-cost solutions in educational classification tasks continues to grow, enhancing model performance has become a critical focus of research. While large language models excel in these tasks, their high cost and resource requirements limit widespread application. This study proposes a Knowledge-Enhanced Distillation (KED) method, utilizing ChatGPT-4, ChatGPT-4o, and Llama3 as teacher models, and three different sizes of BERT models as student models. The method was validated across three real-world educational datasets. The results demonstrate that the KED method significantly improves the accuracy and F1 scores of small models in educational text classification tasks, while also substantially reducing computational costs and resource consumption. Notably, the KED method shows exceptional performance in scenarios involving few-shot learning and class imbalance. The innovation of this study lies in applying the KED method to educational classification tasks, filling a gap in current research and highlighting its significant potential for practical application in educational contexts.
KW - Educational Application
KW - Few-Shot Learning
KW - Knowledge Distillation
KW - Large Language Model
UR - https://www.scopus.com/pages/publications/105003891498
U2 - 10.1109/ICASSP49660.2025.10888451
DO - 10.1109/ICASSP49660.2025.10888451
M3 - Conference contribution
AN - SCOPUS:105003891498
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
BT - 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings
A2 - Rao, Bhaskar D.
A2 - Trancoso, Isabel
A2 - Sharma, Gaurav
A2 - Mehta, Neelesh B.
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Y2 - 6 April 2025 through 11 April 2025
ER -