Word spotting in Chinese document images without layout analysis

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

29 Scopus citations

Abstract

An approach to searching user-specified words/phrases in Chinese document images, without the requirements of layout analysis, is proposed in this paper. Bounding boxes of Chinese character images are first determined using connected component analysis. Next, a suitable character from the user-specified word/phrase is chosen as the initial character to search for a matching candidate in the document. Once a matched candidate is found, its adjacent characters in the horizontal and vertical directions are examined for matching with other corresponding characters in the user-specified word/phrase, subject to the constraints of positional relation and size similarity. The character matching is done in two stages. The coarse matching is carried out based on the stroke density features. A weighted Hausdorff distance (WHD) is proposed for the second matching phase. Experimental results show that the proposed method can effectively search the user-specified Chinese word/phrase from horizontal or vertical text lines of document images.
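The two-stage character matching described in the abstract might be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not define the stroke density features or the WHD, so the grid-based density, the weighted-mean form of the Hausdorff distance, the uniform default weights, and the thresholds `tol` and `max_whd` are all hypothetical choices. Glyphs are assumed to be binary images already cropped to their bounding boxes and scaled to the same size.

```python
from math import hypot

def stroke_density(img, grid=2):
    """Fraction of foreground pixels in each cell of a grid x grid partition.
    Assumed feature definition; the paper's exact features are not given here."""
    h, w = len(img), len(img[0])
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * h // grid, (gy + 1) * h // grid
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            cell = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cell) / len(cell) if cell else 0.0)
    return feats

def coarse_match(a, b, tol=0.25):
    """Stage 1: cheap rejection when any cell density differs by more than tol."""
    fa, fb = stroke_density(a), stroke_density(b)
    return max(abs(p - q) for p, q in zip(fa, fb)) <= tol

def points(img):
    """Coordinates (x, y) of all foreground pixels."""
    return [(x, y) for y, row in enumerate(img) for x, v in enumerate(row) if v]

def directed_whd(pa, pb, weight):
    """Weighted mean of nearest-neighbour distances from pa to pb."""
    total = sum(weight(p) for p in pa)
    return sum(
        weight(p) * min(hypot(p[0] - q[0], p[1] - q[1]) for q in pb) for p in pa
    ) / total

def whd(a, b, weight=lambda p: 1.0):
    """Stage 2: symmetric weighted Hausdorff-style distance (assumed form)."""
    pa, pb = points(a), points(b)
    if not pa or not pb:
        return float("inf")
    return max(directed_whd(pa, pb, weight), directed_whd(pb, pa, weight))

def match(a, b, tol=0.25, max_whd=1.0):
    """Run the fine comparison only for candidates that pass the coarse test."""
    return coarse_match(a, b, tol) and whd(a, b) <= max_whd

# Toy 4x4 glyphs, purely illustrative.
query = [[1, 1, 1, 1],
         [1, 0, 0, 0],
         [1, 0, 0, 0],
         [1, 1, 1, 1]]
other = [[0, 0, 0, 1],
         [0, 0, 0, 1],
         [0, 0, 0, 1],
         [0, 0, 0, 1]]
```

Calling `match(query, query)` accepts the candidate, while `match(query, other)` is rejected at the coarse stage, so the more expensive point-set distance is never computed for it. That ordering is the usual motivation for a coarse-to-fine design.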

Original language: English
Title of host publication: Proceedings - 16th International Conference on Pattern Recognition, ICPR 2002
Editors: G. Sanniti di Baja, Y. Shirai, M. Kunt, D. Laurendeau, R. Woodham, K. Boyer, L. Shapiro, R. Kasturi, C. Suen, N. Ayache, H. Bunke, H. Christensen
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 57-60
Number of pages: 4
ISBN (Electronic): 0769516963
State: Published - 2002
Externally published: Yes
Event: 16th International Conference on Pattern Recognition, ICPR 2002 - Quebec City, Canada
Duration: 11 Aug 2002 to 15 Aug 2002

Publication series

Name: Proceedings - International Conference on Pattern Recognition
Volume: 3
ISSN (Print): 1051-4651

Conference

Conference: 16th International Conference on Pattern Recognition, ICPR 2002
Country/Territory: Canada
City: Quebec City
Period: 11/08/02 to 15/08/02
