TY - GEN
T1 - Using multiple sequence alignment and statistical language model to integrate multiple Chinese address recognition outputs
AU - Chen, Shengchang
AU - Lu, Shujing
AU - Wen, Ying
AU - Lu, Yue
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/11/20
Y1 - 2015/11/20
N2 - Different recognizers make different mistakes when recognizing a Chinese address. In this paper, we present a method that combines the outputs of multiple Chinese address recognizers to improve recognition accuracy. The method first employs multiple sequence alignment to generate a lattice of candidate hypotheses from the outputs of the different recognizers, and then applies a statistical language model to choose the maximum-likelihood candidate sequence. Taking this sequence as the final decision, our method outperforms both the individual recognizers and Miyao's method. Experiments on address images from real envelopes demonstrate that the proposed method increases the character recognition accuracy rate from 95.80% to 98.38%, a 61.30% error reduction. Furthermore, the correct sorting rate of an automatic mail sorting system increases from 84.11% to 93.72%.
KW - minimum edit distance
KW - multiple Chinese address recognition outputs
KW - multiple sequence alignment
KW - statistical language model
UR - https://www.scopus.com/pages/publications/84962536306
U2 - 10.1109/ICDAR.2015.7333742
DO - 10.1109/ICDAR.2015.7333742
M3 - Conference contribution
AN - SCOPUS:84962536306
T3 - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
SP - 151
EP - 155
BT - 13th IAPR International Conference on Document Analysis and Recognition, ICDAR 2015 - Conference Proceedings
PB - IEEE Computer Society
T2 - 13th International Conference on Document Analysis and Recognition, ICDAR 2015
Y2 - 23 August 2015 through 26 August 2015
ER -