
ReELFA: A scene text recognizer with encoded location and focused attention

  • Qingqing Wang*
  • Wenjing Jia
  • Xiangjian He
  • Yue Lu
  • Michael Blumenstein
  • Ye Huang
  • Shujing Lyu
  • *Corresponding author for this work
  • East China Normal University
  • University of Technology Sydney

Research output: Conference contribution, paper, peer-reviewed

Abstract

LSTMs and attention mechanisms are widely used for scene text recognition. However, existing LSTM-based recognizers usually convert 2D feature maps into 1D space by flattening or pooling, which discards the spatial information in text images. In addition, the attention drift problem, in which a model fails to align each target with the proper feature region, seriously degrades the recognition performance of existing models. To tackle these problems, we propose a scene text Recognizer with Encoded Location and Focused Attention, i.e., ReELFA. ReELFA uses one-hot encoded coordinates to indicate the spatial relationship of pixels, and character center masks to help focus attention on the right feature areas. Experiments on the benchmark datasets IIIT5K, SVT, CUTE and IC15 demonstrate that the proposed method achieves comparable performance on regular, low-resolution and noisy text images, and outperforms state-of-the-art approaches on the more challenging curved text images.
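
The idea of one-hot encoded coordinates can be illustrated with a small sketch. The abstract does not specify how ReELFA lays out or injects the encoding, so everything below is an assumption made for illustration only: the function name `append_onehot_coords`, the `(C, H, W)` feature layout, and the choice to concatenate the codes as extra channels are all hypothetical, not the authors' implementation.

```python
import numpy as np


def append_onehot_coords(features: np.ndarray) -> np.ndarray:
    """Append one-hot row and column indicators to a (C, H, W) feature map.

    Hypothetical helper: the channel-concatenation layout is an assumption,
    not something stated in the abstract.
    """
    c, h, w = features.shape
    dtype = features.dtype

    # row_code[i, y, x] is 1 exactly when y == i (one channel per row index).
    row_code = np.broadcast_to(np.eye(h, dtype=dtype)[:, :, None], (h, h, w))

    # col_code[j, y, x] is 1 exactly when x == j (one channel per column index).
    col_code = np.broadcast_to(np.eye(w, dtype=dtype)[:, None, :], (w, h, w))

    # Every pixel now carries its own (row, column) position as H + W extra channels.
    return np.concatenate([features, row_code, col_code], axis=0)


features = np.zeros((4, 3, 5), dtype=np.float32)
encoded = append_onehot_coords(features)
print(encoded.shape)  # (12, 3, 5)
```

With an encoding of this kind, a pixel's position survives a later flattening or pooling step, because the position is stored in the feature vector itself and no longer depends only on where the vector sits in the 2D grid.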

Original language: English
Pages (from-to): 71-76
Number of pages: 6
Publication status: Published - 2019
Event: 2nd International Workshop on Machine Learning, WML 2019 - ICDAR 2019 Workshop - Sydney, Australia
Duration: 21 Sep 2019 to 22 Sep 2019

Conference

Conference: 2nd International Workshop on Machine Learning, WML 2019 - ICDAR 2019 Workshop
Country/Territory: Australia
City: Sydney
Period: 21/09/19 to 22/09/19
