A handwritten Chinese text recognizer applying multi-level multimodal fusion network

Yuhuan Xiu, Qingqing Wang, Hongjian Zhan, Man Lan, Yue Lu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

15 Scopus citations

Abstract

Handwritten Chinese text recognition (HCTR) has received extensive attention from the pattern recognition community in the past decades. Most existing deep learning methods consist of two stages: training a text recognition network on the basis of visual information, then incorporating language constraints with various language models. As a result, the inherent linguistic semantic information is often neglected when designing the recognition network. To tackle this problem, we propose a novel multi-level multimodal fusion network and embed it into an attention-based LSTM so that both the visual information and the linguistic semantic information can be fully leveraged when predicting sequential outputs from the feature vectors. Experimental results on the ICDAR-2013 competition dataset demonstrate results comparable with state-of-the-art approaches.

Original language: English
Title of host publication: Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
Publisher: IEEE Computer Society
Pages: 1464-1469
Number of pages: 6
ISBN (Electronic): 9781728128610
DOIs
State: Published - Sep 2019
Event: 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019 - Sydney, Australia
Duration: 20 Sep 2019 to 25 Sep 2019

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
ISSN (Print): 1520-5363

Conference

Conference: 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
Country/Territory: Australia
City: Sydney
Period: 20/09/19 to 25/09/19

Keywords

  • Attention based LSTM
  • Handwritten Chinese text recognition
  • Language model
  • Linguistic semantic information
  • Multi-level multimodal fusion
