EPIC: Error Pattern Informed Correction for Classroom ASR with Limited Labeled Data

  • Linzhao Jia
  • Han Sun
  • Yuang Wei
  • Changyong Qi
  • Xiaozhe Yang*
  • *Corresponding author for this work
  • East China Normal University

Research output: Contribution to journal › Conference article › peer-review

Abstract

Automatic speech recognition (ASR) systems have a wide range of applications in classroom analysis. However, due to the unique structure of classroom dialogue, existing ASR systems often struggle to accurately recognize and organize spoken utterances, creating significant challenges for downstream tasks in educational dialogue analysis. To address this issue, we propose EPIC, a post-processing framework for classroom ASR error correction. We begin by extracting error patterns to gain a deeper understanding of the distribution of ASR errors. Next, we utilize large language models (LLMs) to reconstruct contextual information based on these error patterns, offering a viable solution for error correction with limited labeled data. Finally, after fine-tuning an error correction model, we implement a candidate selection process to identify the most appropriate hypothesis for each context. Extensive experiments with our proposed method demonstrate substantial improvements in word error rate (WER) and overall robustness in ASR error correction, enabling more reliable analysis of educational dialogues and offering deeper insights for educational research.
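
The abstract reports results in word error rate (WER). For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and an ASR hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, and the sample sentences are invented.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance / number of reference words.

    Assumption: words are separated by whitespace. Languages written without
    spaces would need a different tokenizer or a character-level metric.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Invented example: one substituted word out of six reference words.
    ref = "open your books to page ten"
    hyp = "open your book to page ten"
    print(f"WER = {word_error_rate(ref, hyp):.3f}")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.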

Keywords

  • ASR Error Correction
  • Classroom Dialogue
  • Large Language Models
  • Limited Labeled Data
