
大模型提示工程能够替代经典深度学习模型吗?——基于医学文本实体关系抽取任务的对比研究

Translated title of the contribution: Can Large Language Model and Prompt Engineering Replace Classical Deep Learning Model? A Comparative Study Based on Medical Text Entity Relation Extraction Task
Yufeng Duan*, Jiahong Xie, Ping Bai, Tianyang Gong
*Corresponding author for this work
East China Normal University

Research output: Contribution to journal › Article › peer-review

Abstract

[Objective] To examine whether large language models (LLMs) with prompt engineering can replace classical deep learning models for entity relation extraction from Chinese medical texts, which are highly specialized and domain-specific. [Methods] The study compares three LLMs (GLM-4, ERNIE-4-Turbo, and DeepSeek-R1) with three classical deep learning models (CBLUE, CasRel, and GPLinker), systematically measuring the performance differences between prompt-based LLMs and the classical models. The comparison varies the number of relation types to be extracted, the number of examples in the LLM prompt, and the training data size for the classical models. BERT-Base and RoBERTa serve as encoders for the classical models. [Results] Experiments on the CMeIEV2 dataset show that: (I) RoBERTa-CBLUE and RoBERTa-GPLinker achieve the best extraction results, with F1 scores of 0.5826 and 0.5853, respectively, when extracting one relation type, and 0.5112 and 0.4934 when extracting ten relation types; (II) LLMs perform poorly when extracting multiple relation types simultaneously: moving from one relation type to two, the F1 scores of GLM-4, ERNIE-4-Turbo, and DeepSeek-R1 decrease by 0.1182, 0.0885, and 0.1310, respectively; (III) adding examples to the prompt can improve the extraction performance of LLMs, but more examples do not necessarily yield better results. [Limitations] The study uses a single dataset; future work could extend the experiments to datasets from other domains. [Conclusions] Prompt engineering with LLMs cannot yet readily replace classical deep learning models and is best regarded as an alternative only when labeled samples are limited.
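The abstract reports F1 scores without stating the matching protocol. As a rough illustration only, the sketch below computes a micro-averaged F1 over extracted (subject, relation, object) triples, assuming exact-match scoring; the function name `micro_f1`, the triple layout, and the sample relation labels are hypothetical and are not taken from the paper or from the CMeIEV2 dataset.

```python
from typing import Iterable, Set, Tuple

# Assumed representation: one (subject, relation, object) triple per fact.
Triple = Tuple[str, str, str]


def micro_f1(gold: Iterable[Set[Triple]], pred: Iterable[Set[Triple]]) -> float:
    """Micro-averaged F1 over per-sentence triple sets, using exact match."""
    true_pos = n_pred = n_gold = 0
    for gold_set, pred_set in zip(gold, pred):
        true_pos += len(gold_set & pred_set)  # triples that match exactly
        n_pred += len(pred_set)
        n_gold += len(gold_set)
    if true_pos == 0:
        return 0.0  # also avoids division by zero when nothing matches
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: one sentence, two gold triples, two predicted triples.
gold = [{("pneumonia", "clinical_manifestation", "cough"),
         ("pneumonia", "drug_therapy", "amoxicillin")}]
pred = [{("pneumonia", "clinical_manifestation", "cough"),
         ("pneumonia", "clinical_manifestation", "fever")}]
score = micro_f1(gold, pred)
```

Here one of two predicted triples is correct and one of two gold triples is recovered, so precision and recall are both 0.5. Under this assumed protocol a triple counts only if subject, relation, and object all match, which is one reason scores on this kind of task can look low.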

Original language: Chinese (Traditional)
Pages (from-to): 61-75
Number of pages: 15
Journal: Data Analysis and Knowledge Discovery
Volume: 10
Issue number: 1
State: Published - 25 Jan 2026
