VisCGEC: Benchmarking the Visual Chinese Grammatical Error Correction

  • Xiaoman Wang
  • Dan Yuan
  • Xin Liu
  • Yike Zhao
  • Xiaoxiao Zhang
  • Xizhi Chen
  • Yunshi Lan*
  • *Corresponding author for this work
  • East China Normal University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

Chinese Grammatical Error Correction (CGEC) plays a significant role in providing automatic feedback to students' writing, especially for Chinese as a Foreign Language Learner (CFL). Particularly, rudimentary CFLs write Chinese characters where phonological and visual confusion is constantly involved. However, existing CGEC studies ignore the multi-modality and potential faked errors (i.e., non-existent characters created due to writing errors), which pushes the techniques far away from real-world scenarios. To address this gap, we develop a dataset, namely VisCGEC, to benchmark the visual Chinese grammatical error correction for Chinese as a Foreign Language Learner (CFL). The dataset contains 2,451 images of handwritten sentences with grammatical errors and corresponding correction texts, which Chinese language experts meticulously annotate. In addition, we propose baseline approaches on VisCGEC and conduct experiments with two CGEC frameworks (i.e., a two-stage pipeline and an end-to-end system), providing a strong baseline for future research. Extensive empirical results and analyses demonstrate that VisCGEC is high-quality but challenging, where the best approach achieves an F0.5 score of only 28.9%. Our dataset and baseline methods are available at https://github.com/xiaoAugenstern/VisCGEC.

Original language: English
Title of host publication: Long Papers
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Publisher: Association for Computational Linguistics (ACL)
Pages: 5054-5068
Number of pages: 15
ISBN (electronic): 9798891761896
Publication status: Published - 2025
Event: 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2025 - Hybrid, Albuquerque, United States
Duration: 29 Apr 2025 to 4 May 2025

Publication series

Name: Proceedings of the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies: Long Papers, NAACL-HLT 2025
Volume: 1

Conference

Conference: 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2025
Country/Territory: United States
City: Hybrid, Albuquerque
Period: 29/04/25 to 4/05/25
