TY - JOUR
T1 - Machine translationese of large language models
T2 - Dependency triplets, text classification, and SHAP analysis
AU - Zhang, Shukang
AU - Zhao, Chaoyong
N1 - Publisher Copyright:
© 2026 Zhang, Zhao. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2026/1
Y1 - 2026/1
N2 - This study addresses the challenge of distinguishing human translations from those generated by Large Language Models (LLMs) by utilizing dependency triplet features and evaluating 16 machine learning classifiers. Using 10-fold cross-validation, the SVM model achieves the highest mean F1-score of 93%, while all other classifiers consistently differentiate between human and machine translations. SHAP analysis helps identify key dependency features that distinguish human and machine translations, improving our understanding of how LLMs produce translationese. The findings provide practical insights for enhancing translation quality assessment and refining translation models across various languages and text genres, contributing to the advancement of natural language processing techniques.
AB - This study addresses the challenge of distinguishing human translations from those generated by Large Language Models (LLMs) by utilizing dependency triplet features and evaluating 16 machine learning classifiers. Using 10-fold cross-validation, the SVM model achieves the highest mean F1-score of 93%, while all other classifiers consistently differentiate between human and machine translations. SHAP analysis helps identify key dependency features that distinguish human and machine translations, improving our understanding of how LLMs produce translationese. The findings provide practical insights for enhancing translation quality assessment and refining translation models across various languages and text genres, contributing to the advancement of natural language processing techniques.
UR - https://www.scopus.com/pages/publications/105027047114
U2 - 10.1371/journal.pone.0339769
DO - 10.1371/journal.pone.0339769
M3 - Article
C2 - 41511938
AN - SCOPUS:105027047114
SN - 1932-6203
VL - 21
JO - PLoS ONE
JF - PLoS ONE
IS - 1 January
M1 - e0339769
ER -