A search engine for imaged documents in PDF files

Yue Lu, Li Zhang, Chew Lim Tan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

5 Scopus citations

Abstract

Large quantities of documents on the Internet and in digital libraries are simply scanned and archived in image format, and many of them are packed in PDF files. The word search tool provided by Adobe Reader/Acrobat does not work for these imaged documents. In this paper, we present a search engine that addresses this issue for imaged documents in PDF files. The experimental results show encouraging performance.

Original language: English
Title of host publication: Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery (ACM)
Pages: 536-537
Number of pages: 2
ISBN (Print): 1581138814, 9781581138818
State: Published - 2004
Externally published: Yes
Event: Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - Sheffield, United Kingdom
Duration: 25 Jul 2004 to 29 Jul 2004

Keywords

  • Imaged Document
  • PDF files
  • Word searching
