TY - GEN
T1 - Evaluating the Performance of Complex Text Generated by Large Language Models
AU - Bi, Fenglin
AU - Wang, Yantong
AU - Han, Fanyu
AU - Li, Zhi
AU - Hu, Tao
AU - Zhang, Yanbin
AU - Wang, Wei
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
AB - The rapid advancement of Large Language Models (LLMs) has significantly enhanced text generation quality in Natural Language Processing (NLP). However, practical applications impose complex requirements, particularly in fields such as generating analysis reports. This study systematically reviews evaluation methods for LLMs and proposes a general framework for complex text generation, encompassing Generation Content, Prompt Dimension, Retrieval-Augmented Generation (RAG), and LLM Fine-tuning. We first introduce the Complex Text Generation Task Evaluation Paradigm. Based on this paradigm, we identify 15 sub-indicators with corresponding evaluation methods to comprehensively assess and improve LLM performance. Our research fills gaps in existing evaluation systems and provides a scalable framework for future studies, enhancing the applicability and impact of LLMs across various domains.
KW - Complex Text Generation
KW - Controllable Text Generation
KW - Large Language Models
KW - Natural Language Processing
UR - https://www.scopus.com/pages/publications/105006928487
U2 - 10.1007/978-981-96-6310-1_11
DO - 10.1007/978-981-96-6310-1_11
M3 - Conference contribution
AN - SCOPUS:105006928487
SN - 9789819663095
T3 - Communications in Computer and Information Science
SP - 151
EP - 167
BT - Intelligent Computers, Algorithms, and Applications - 4th BenchCouncil International Symposium, IC 2024, Revised Selected Papers
A2 - Luo, Chunjie
A2 - Li, Weiping
PB - Springer Science and Business Media Deutschland GmbH
T2 - 4th BenchCouncil International Symposium on Intelligent Computers, Algorithms, and Applications, IC 2024
Y2 - 4 December 2024 through 6 December 2024
ER -