DLE: Document Illumination Correction with Dynamic Light Estimation

  • Jiahao Quan
  • Hailing Wang
  • Chunwei Wu
  • Guitao Cao*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Document images captured with mobile devices in natural environments are often affected by various types of illumination degradation. This degradation reduces the clarity and readability of document images, which complicates downstream OCR tasks. Existing methods typically address only one or a few degradation types and do not account for their diversity. In addition, these methods usually rely on a fixed, pre-trained sub-network to estimate background light or shadows, which limits flexibility and adaptability. To overcome these challenges, this study proposes a novel framework named DLE, which comprises a two-loop generative adversarial network and a multi-modal discriminator. Specifically, to improve the quality of the image representation, a mask extractor is embedded before the image is input to the generator. This forces the model to focus on the distinctive features in the image, enhancing the representation of regions with anomalous or degraded illumination. The mask extractor generates a luminance mask that measures the difference in illumination between the input and target images. The mask extractor is then dynamically optimized as part of the consistency loss computation, strengthening its ability to estimate the illumination-degraded regions. Moreover, a pre-trained visual-language model is introduced into the multi-modal discriminator, leveraging its robust cross-modal alignment capability to improve the semantic consistency of the generated images with the preset input text. Extensive experiments demonstrate that our approach achieves state-of-the-art (SOTA) performance in terms of edit distance (ED) and character error rate (CER).
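
The abstract names edit distance (ED) and character error rate (CER) as the evaluation metrics but does not define them. The sketch below uses the common definitions, which is an assumption about the paper's exact protocol: ED is the Levenshtein distance between the ground-truth text and the OCR output, and CER is that distance divided by the ground-truth length. The function names `edit_distance` and `character_error_rate` are illustrative and do not come from the paper.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn `hypothesis`
    into `reference`. Assumed here to be the ED metric."""
    # prev[j] holds the distance between the first i-1 characters of
    # `reference` and the first j characters of `hypothesis`.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_ch in enumerate(reference, start=1):
        curr = [i]  # distance from reference[:i] to the empty string
        for j, hyp_ch in enumerate(hypothesis, start=1):
            cost = 0 if ref_ch == hyp_ch else 1
            curr.append(min(
                prev[j] + 1,         # skip a reference character
                curr[j - 1] + 1,     # skip a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER under the assumed definition: edit distance normalized by
    the number of characters in the ground-truth text."""
    if not reference:
        raise ValueError("reference text must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under these definitions, lower values are better for both metrics: the closer the OCR output of a corrected image is to the ground-truth text, the smaller the ED and CER.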

Original language: English
Title of host publication: 2024 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3701-3707
Number of pages: 7
ISBN (Electronic): 9781665410205
State: Published - 2024
Event: 2024 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2024 - Kuching, Malaysia
Duration: 6 Oct 2024 to 10 Oct 2024

Publication series

Name: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
ISSN (Print): 1062-922X

Conference

Conference: 2024 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2024
Country/Territory: Malaysia
City: Kuching
Period: 6/10/24 to 10/10/24
