Keyword searching in compressed document images

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Summary form only given. A compressed pattern matching method is presented for searching for keywords in CCITT Group 4-compressed document images without explicit decompression. According to the CCITT Group 4 standard, each coded position indicates that the color of the current pixel differs from that of the previous pixel, except for the coded positions that follow pass mode. The changing elements are extracted from the compressed images and then used to segment and bound the word objects and to measure the similarity of two word images. A two-stage matching strategy is constructed to measure the dissimilarity between the template image of the user's query word and each word extracted from the document images. Experiments were conducted to verify the validity of the approach. The results show that the proposed approach was much faster than the traditional approach, because it avoids the pixel-level processing needed to analyze connected components and extract word features.
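
To illustrate why changing elements alone can suffice for bounding and comparing word images, here is a minimal Python sketch. It is not the authors' method or their two-stage matching strategy. It assumes the changing elements have already been extracted from the Group 4 stream as sorted column indices per scan line, with every line starting in white. The names `black_runs`, `bounding_box`, and `row_dissimilarity` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: one scan line is a strictly increasing list of column indices
# at which the pixel color flips, and every line starts in white.
Row = List[int]


def black_runs(row: Row, width: int) -> List[Tuple[int, int]]:
    """Half-open (start, end) black runs implied by the changing elements."""
    runs = []
    for i in range(0, len(row), 2):
        start = row[i]
        # An unmatched final flip means the run extends to the line end.
        end = row[i + 1] if i + 1 < len(row) else width
        runs.append((start, end))
    return runs


def bounding_box(rows: List[Row], width: int) -> Optional[Tuple[int, int, int, int]]:
    """(top, left, bottom, right) of all black pixels, half-open; None if blank."""
    top = bottom = None
    left, right = width, 0
    for y, row in enumerate(rows):
        runs = black_runs(row, width)
        if not runs:
            continue
        if top is None:
            top = y
        bottom = y
        left = min(left, runs[0][0])
        right = max(right, runs[-1][1])
    if top is None:
        return None
    return top, left, bottom + 1, right


def row_dissimilarity(a: Row, b: Row, width: int) -> int:
    """Number of differing pixels between two lines, without expanding pixels.

    The XOR of two lines flips exactly where one line flips and the other
    does not, so its changing elements are the symmetric difference.
    """
    xor_flips = sorted(set(a) ^ set(b))
    return sum(end - start for start, end in black_runs(xor_flips, width))
```

For example, `row_dissimilarity([2, 5], [3, 7], 10)` compares a line with black pixels in columns 2 to 4 against one with black pixels in columns 3 to 6, using only the four flip positions rather than the ten pixels.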

Original language: English
Title of host publication: Proceedings - DCC 2003
Subtitle of host publication: Data Compression Conference
Editors: James A. Storer, Martin Cohn
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 437
Number of pages: 1
ISBN (Electronic): 0769518966
State: Published - 2003
Externally published: Yes
Event: Data Compression Conference, DCC 2003 - Snowbird, United States
Duration: 25 Mar 2003 to 27 Mar 2003

Publication series

Name: Data Compression Conference Proceedings
Volume: 2003-January
ISSN (Print): 1068-0314

Conference

Conference: Data Compression Conference, DCC 2003
Country/Territory: United States
City: Snowbird
Period: 25/03/03 to 27/03/03

Keywords

  • Books
  • Code standards
  • Computer science
  • Image coding
  • Image segmentation
  • Internet
  • Keyword search
  • Pattern matching
  • Pixel
  • Software libraries
