Abstract
[Objective] This study investigates performance differences among existing large language models (LLMs) in extracting entities and relations from Chinese medical text, and analyzes how the number of examples and the number of relation types affect extraction performance. [Methods] Using a prompt engineering approach, we call nine mainstream LLMs through their APIs and vary the prompt along two dimensions: the number of examples and the number of relation types. Experiments are conducted on the CMeIE-V2 dataset to compare extraction performance. [Results] (Ⅰ) GLM-4-0520 achieves the best overall extraction performance, with F1 scores of 0.4422, 0.3869, and 0.3874 on the three relation types “clinical manifestation”, “medication”, and “etiology”, respectively. (Ⅱ) As the number of examples m in the prompt varies, the F1 score initially increases with m, reaches a maximum of 0.4742 at m = 8, and declines for m > 8. (Ⅲ) As the number of relation types to be extracted, n, increases, the F1 score drops significantly: at n = 2 the F1 score is 0.1182 lower than at n = 1, and at n = 10 it is only 0.2949. [Limitations] Few public datasets are currently available, so the experimental results are based on a single dataset. In addition, because medical-domain LLMs are difficult to access via API, all models used in this study are general-domain models. [Conclusions] Extraction performance varies greatly among LLMs; a suitable number of examples improves extraction performance, but more is not always better; and LLMs are not good at extracting multiple relation types at the same time.
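
The two prompt variables (m examples, n relation types) and the F1 measure can be illustrated with a minimal sketch. The abstract does not give the prompt template or the scoring protocol, so everything here is an assumption: the function names `build_prompt` and `triple_f1`, the English prompt wording, the (subject, relation, object) representation, and exact-match scoring over triples are hypothetical choices, not the authors' method.

```python
from typing import Iterable

# Hypothetical representation of one extracted fact: (subject, relation, object).
Triple = tuple[str, str, str]


def build_prompt(
    text: str,
    relation_types: list[str],
    examples: list[tuple[str, list[Triple]]],
) -> str:
    """Assemble a few-shot prompt with m = len(examples) and n = len(relation_types).

    The wording is an illustrative assumption, not the template used in the study.
    """
    lines = [
        "Extract (subject, relation, object) triples from the medical text.",
        "Allowed relation types: " + ", ".join(relation_types),
        "",
    ]
    # One demonstration block per example; m is varied by changing this list.
    for ex_text, ex_triples in examples:
        lines.append("Text: " + ex_text)
        lines.append(
            "Triples: " + "; ".join("(" + ", ".join(t) + ")" for t in ex_triples)
        )
        lines.append("")
    # The query text comes last, leaving the answer slot open for the model.
    lines.append("Text: " + text)
    lines.append("Triples:")
    return "\n".join(lines)


def triple_f1(predicted: Iterable[Triple], gold: Iterable[Triple]) -> float:
    """F1 over exact-match triples (assumed scoring rule)."""
    pred_set, gold_set = set(predicted), set(gold)
    if not pred_set or not gold_set:
        return 0.0
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```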
| Translated title of the contribution | Entity Relation Extraction of Chinese Medical Text Based on Large Language Model and Prompt Engineering |
|---|---|
| Original language | Chinese (Traditional) |
| Pages (from-to) | 25-36 |
| Number of pages | 12 |
| Journal | Data Analysis and Knowledge Discovery |
| Volume | 9 |
| Issue number | 9 |
| State | Published - 25 Sep 2025 |