Word searching in CCITT group 4 compressed document images

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

11 Scopus citations

Abstract

In this paper, we present a compressed pattern matching method for searching for user-queried words in CCITT Group 4 compressed document images without decompressing them. Feature pixels, composed of black changing elements and white changing elements, are extracted directly from the compressed images. Connected components are labeled line by line according to the relative positions of the changing elements of the current coding line and those of the reference line. Word bounding boxes are obtained by merging the connected components. A two-stage matching strategy measures the dissimilarity between the template image of the user's query word and the words extracted from the document images. Experimental results confirmed the validity of the proposed approach.
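The line-by-line labeling and word-box merging described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the Group 4 mode codes have already been decoded into a sorted list of changing-element columns per scan line, with each line starting white. It also assumes 8-connectivity, a single text line, and a gap threshold `max_gap` for merging; all function names and parameters are hypothetical. The two-stage matching step is not covered.

```python
def runs_from_changes(changes, width):
    """Convert sorted changing-element columns into black runs [start, end)."""
    # Each line starts white, so even-indexed changes open a black run
    # and odd-indexed changes close it.
    edges = list(changes)
    if len(edges) % 2:
        edges.append(width)  # black run that reaches the right margin
    return [(edges[i], edges[i + 1]) for i in range(0, len(edges), 2)]


def label_components(lines, width):
    """Label black runs line by line; return sorted boxes (x0, y0, x1, y1)."""
    parent = []  # union-find forest over provisional labels

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    all_runs = []   # (start, end, y, label) for every black run
    reference = []  # (start, end, label) runs of the line above
    for y, changes in enumerate(lines):
        current = []
        for start, end in runs_from_changes(changes, width):
            label = None
            for rs, re, rl in reference:
                # Assumed 8-connectivity: runs overlap or touch diagonally.
                if rs <= end and start <= re:
                    if label is None:
                        label = find(rl)
                    else:
                        union(label, rl)
                        label = find(label)
            if label is None:  # no neighbour above: start a new component
                label = len(parent)
                parent.append(label)
            current.append((start, end, label))
            all_runs.append((start, end, y, label))
        reference = current

    boxes = {}
    for start, end, y, label in all_runs:
        root = find(label)
        if root not in boxes:
            boxes[root] = [start, y, end, y + 1]
        else:
            b = boxes[root]
            b[0], b[1] = min(b[0], start), min(b[1], y)
            b[2], b[3] = max(b[2], end), max(b[3], y + 1)
    return sorted(tuple(b) for b in boxes.values())


def merge_into_words(boxes, max_gap):
    """Merge component boxes on one text line whose horizontal gap <= max_gap."""
    words = []
    for box in sorted(boxes):
        if words and box[0] - words[-1][2] <= max_gap:
            last = words[-1]
            words[-1] = (last[0], min(last[1], box[1]),
                         max(last[2], box[2]), max(last[3], box[3]))
        else:
            words.append(tuple(box))
    return words
```

For example, `label_components([[1, 3, 5, 7], [1, 3, 5, 7], [1, 7]], 8)` treats the two vertical strokes as one component once the third line joins them. Choosing `max_gap` would in practice depend on the inter-character and inter-word spacing of the document.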

Original language: English
Title of host publication: Proceedings - 7th International Conference on Document Analysis and Recognition, ICDAR 2003
Publisher: IEEE Computer Society
Pages: 467-471
Number of pages: 5
ISBN (Electronic): 0769519601
State: Published - 2003
Externally published: Yes
Event: 7th International Conference on Document Analysis and Recognition, ICDAR 2003 - Edinburgh, United Kingdom
Duration: 3 Aug 2003 to 6 Aug 2003

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Volume: 2003-January
ISSN (Print): 1520-5363

Conference

Conference: 7th International Conference on Document Analysis and Recognition, ICDAR 2003
Country/Territory: United Kingdom
City: Edinburgh
Period: 3/08/03 to 6/08/03
