Similarity measure for CCITT group 4 compressed document images

  • Y. Lu*
  • C. L. Tan
  • L. Fan
  • W. Huang
  • *Corresponding author for this work

Research output: Contribution to conference > Paper > peer-review

3 Scopus citations

Abstract

Measuring the similarity of document images plays a crucial role in document image retrieval. This paper proposes a method for measuring the similarity of CCITT Group 4 compressed document images. Features are extracted directly from the changing elements of the compressed images. An unsupervised classifier uses the weighted Hausdorff distance to assign all word objects from two document images to corresponding classes, while possible stop words are excluded. Document vectors are built from the occurrence frequencies of the word object classes, and the pairwise similarity of two document images is represented by the scalar product of their document vectors. Five groups of articles from different domains are used to test the validity of the approach.
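The last step of the abstract (document vectors built from class frequencies, compared by a scalar product) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes word objects have already been assigned integer class labels with stop words removed, the function names `document_vector` and `similarity` are hypothetical, and the normalization to unit length is an assumption that the abstract does not state.

```python
from collections import Counter
from math import sqrt


def document_vector(word_classes, vocabulary):
    """Count how often each word-object class occurs in one document."""
    counts = Counter(word_classes)
    return [counts[c] for c in vocabulary]


def similarity(doc_a, doc_b):
    """Scalar product of the two documents' class-frequency vectors.

    Assumption: vectors are scaled to unit length first, so the score
    lies in [0, 1]; the abstract only mentions a scalar product.
    """
    # Shared vocabulary: every class label seen in either document.
    vocabulary = sorted(set(doc_a) | set(doc_b))
    va = document_vector(doc_a, vocabulary)
    vb = document_vector(doc_b, vocabulary)
    norm_a = sqrt(sum(x * x for x in va))
    norm_b = sqrt(sum(x * x for x in vb))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document shares nothing with any other
    dot = sum(x * y for x, y in zip(va, vb))
    return dot / (norm_a * norm_b)
```

For example, `similarity([0, 0, 1], [0, 1, 1])` compares a document with two occurrences of class 0 and one of class 1 against a document with the opposite counts; documents with no classes in common score 0.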

Original language: English
Pages: 1118-1121
Number of pages: 4
State: Published - 2001
Externally published: Yes
Event: IEEE International Conference on Image Processing (ICIP) 2001 - Thessaloniki, Greece
Duration: 7 Oct 2001 → 10 Oct 2001

Conference

Conference: IEEE International Conference on Image Processing (ICIP) 2001
Country/Territory: Greece
City: Thessaloniki
Period: 7/10/01 → 10/10/01
