Applying batch normalization to hybrid NN-HMM model for speech recognition

  • Hongjian Zhan*
  • Guilin Chen
  • Yue Lu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Batch normalization has shown success in image classification and other image processing areas by reducing internal covariate shift during the training of deep network models. In this paper, we propose to apply batch normalization to speech recognition within the hybrid NN-HMM model. We evaluate the performance of this method in the acoustic model of the hybrid system on a speaker-independent speech recognition task using several Chinese datasets. Compared with the previous best model on these Chinese datasets, batch normalization yields a relative word error rate (WER) reduction of 8%–13%, while requiring only 60% of the original model's training iterations.
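The abstract refers to batch normalization's standard mechanism: normalizing each feature over the mini-batch before a learned scale and shift. A minimal NumPy sketch of the forward pass (illustrative only; function and parameter names are ours, not the authors' code):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Batch-normalize a mini-batch x of shape (batch, features).

    Each feature is normalized to zero mean and unit variance over
    the batch, then scaled by gamma and shifted by beta (both learned).
    """
    mu = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalized activations
    return gamma * x_hat + beta

# Toy mini-batch: 4 examples, 3 features.
x = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 6.0, 9.0],
              [4.0, 8.0, 12.0]])
y = batch_norm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
```

With `gamma = 1` and `beta = 0`, each output column has approximately zero mean and unit variance regardless of the input scale, which is the property credited with reducing internal covariate shift and speeding up training.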

Original language: English
Title of host publication: Pattern Recognition - 7th Chinese Conference, CCPR 2016, Proceedings
Editors: Tieniu Tan, Xilin Chen, Xuelong Li, Jian Yang, Hong Cheng, Jie Zhou
Publisher: Springer Verlag
Pages: 427-435
Number of pages: 9
ISBN (Print): 9789811030048
DOIs
State: Published - 2016

Publication series

Name: Communications in Computer and Information Science
Volume: 663
ISSN (Print): 1865-0929
