TY - GEN
T1 - Handwritten Digit String Recognition using Convolutional Neural Network
AU - Zhan, Hongjian
AU - Lyu, Shujing
AU - Lu, Yue
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/11/26
Y1 - 2018/11/26
N2 - String recognition is one of the most important tasks in computer vision applications. Recently, combinations of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been widely applied to string recognition. However, RNNs are not only hard to train but also time-consuming. In this paper, we propose a new architecture based on CNNs only and apply it to handwritten digit string recognition (HDSR). The network is composed of three parts, from bottom to top: feature extraction layers, feature dimension transposition layers, and an output layer. Motivated by the superior performance of DenseNet, we utilize dense blocks for feature extraction. At the top of the network, a connectionist temporal classification (CTC) output layer calculates the loss and decodes the feature sequence, while feature dimension transposition layers connect the feature extraction layers to the output layer. Experiments demonstrate that, compared to other methods, the proposed method obtains significant improvements on the ORAND-CAR-A and ORAND-CAR-B datasets, with recognition rates of 92.2% and 94.02%, respectively.
AB - String recognition is one of the most important tasks in computer vision applications. Recently, combinations of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been widely applied to string recognition. However, RNNs are not only hard to train but also time-consuming. In this paper, we propose a new architecture based on CNNs only and apply it to handwritten digit string recognition (HDSR). The network is composed of three parts, from bottom to top: feature extraction layers, feature dimension transposition layers, and an output layer. Motivated by the superior performance of DenseNet, we utilize dense blocks for feature extraction. At the top of the network, a connectionist temporal classification (CTC) output layer calculates the loss and decodes the feature sequence, while feature dimension transposition layers connect the feature extraction layers to the output layer. Experiments demonstrate that, compared to other methods, the proposed method obtains significant improvements on the ORAND-CAR-A and ORAND-CAR-B datasets, with recognition rates of 92.2% and 94.02%, respectively.
UR - https://www.scopus.com/pages/publications/85059745662
U2 - 10.1109/ICPR.2018.8546100
DO - 10.1109/ICPR.2018.8546100
M3 - Conference contribution
AN - SCOPUS:85059745662
T3 - Proceedings - International Conference on Pattern Recognition
SP - 3729
EP - 3734
BT - 2018 24th International Conference on Pattern Recognition, ICPR 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 24th International Conference on Pattern Recognition, ICPR 2018
Y2 - 20 August 2018 through 24 August 2018
ER -