TY - GEN
T1 - Handwritten Digit String Recognition for Indian Scripts
AU - Zhan, Hongjian
AU - Chowdhury, Pinaki Nath
AU - Pal, Umapada
AU - Lu, Yue
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
AB - In many documents, digits/numerals may touch each other, so digit string recognition is necessary because segmenting individual numerals from a touching string is difficult. In this paper, we propose a digit string recognition system for four popular Indian scripts. We consider strings of Kannada, Oriya, Tamil and Telugu scripts in our experiments. This paper makes two contributions: (i) we have developed four digit string datasets, one for each of these scripts. Each dataset has 20000 numeral string samples for training and 30000 samples for testing. As no such dataset is currently available, these will be helpful to the community; (ii) we apply an RNN-free architecture based on a CNN (Convolutional Neural Network) and CTC (Connectionist Temporal Classification) for numeral string recognition. Unlike a normal text string, a string of digits has no contextual information among its digits, and hence a digit may be followed by an arbitrary digit. For this reason, we apply a CNN and CTC based architecture without an RNN for numeral string recognition. We tested our scheme on our test datasets, and the results are provided.
KW - Connectionist Temporal Classification
KW - Convolutional Neural Network
KW - Postal Automation
KW - String recognition
UR - https://www.scopus.com/pages/publications/85081616180
U2 - 10.1007/978-3-030-41299-9_21
DO - 10.1007/978-3-030-41299-9_21
M3 - Conference contribution
AN - SCOPUS:85081616180
SN - 9783030412982
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 262
EP - 273
BT - Pattern Recognition - 5th Asian Conference, ACPR 2019, Revised Selected Papers
A2 - Palaiahnakote, Shivakumara
A2 - Sanniti di Baja, Gabriella
A2 - Wang, Liang
A2 - Yan, Wei Qi
PB - Springer
T2 - 5th Asian Conference on Pattern Recognition, ACPR 2019
Y2 - 26 November 2019 through 29 November 2019
ER -