Cross-domain document layout analysis using document style guide

  • Xingjiao Wu
  • Luwei Xiao
  • Xiangcheng Du
  • Yingbin Zheng
  • Xin Li
  • Tianlong Ma
  • Cheng Jin*
  • Liang He

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Document layout analysis (DLA) is a crucial computer vision task that involves partitioning document images into high-level semantic regions such as figures, tables, backgrounds, and text. Deep learning models for DLA typically require large amounts of labeled data, which can be expensive to obtain. Although some researchers train on generated data, a substantial style gap remains between the generated and target data, and the quality of the generated samples must be improved to allow better control over generation. To address these challenges, we propose a cross-domain DLA framework called DL-DSG, which leverages document-style guidance. DL-DSG comprises three components: a document layout generator (DLG) responsible for generating document element locations, a document element decorator (DED) that fills those elements with content, and a document style discriminator (DSD) that provides style guidance. Beyond generating controlled documents, we also focus on bridging the gap between generated and target samples. To this end, we introduce a novel strategy that transforms document style judgment into a cross-domain style guidance component. We evaluate DL-DSG on popular DLA datasets, including PubLayNet, DSSE-200, CS-150, and CDSSE, and demonstrate its superior performance.
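The abstract names three cooperating components (DLG, DED, DSD). The sketch below is a purely hypothetical Python skeleton of that pipeline, written only to make the data flow concrete; the function names, signatures, and placeholder logic are assumptions, not the paper's implementation (which uses learned deep models rather than these stubs).

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical skeleton of the DL-DSG pipeline described in the abstract.
# Component names follow the paper; every interface here is an assumption.

Box = Tuple[int, int, int, int]  # (x, y, width, height) of one element


@dataclass
class GeneratedDocument:
    layout: List[Box]     # element locations proposed by the DLG
    elements: List[str]   # element types filled in by the DED


def document_layout_generator(n_elements: int, page_w: int, page_h: int) -> List[Box]:
    """DLG: propose element locations on the page (placeholder logic:
    a simple vertical partition instead of a learned generator)."""
    h = page_h // n_elements
    return [(0, i * h, page_w, h) for i in range(n_elements)]


def document_element_decorator(layout: List[Box]) -> List[str]:
    """DED: fill each generated region with a document element
    (placeholder: cycle through the region types named in the abstract)."""
    kinds = ["text", "figure", "table", "background"]
    return [kinds[i % len(kinds)] for i in range(len(layout))]


def document_style_discriminator(doc: GeneratedDocument) -> float:
    """DSD: score how closely a generated document matches the target
    domain's style; in the paper this signal guides generation.
    Placeholder: returns 1.0 for any non-empty document."""
    return 1.0 if doc.elements else 0.0


def generate(n_elements: int = 4, page_w: int = 600, page_h: int = 800) -> GeneratedDocument:
    """Run DLG then DED; the DSD score would steer training in the real system."""
    layout = document_layout_generator(n_elements, page_w, page_h)
    return GeneratedDocument(layout, document_element_decorator(layout))


doc = generate()
score = document_style_discriminator(doc)
```

In the actual framework each stub would be a trained network, and the DSD's judgment is fed back as cross-domain style guidance rather than computed as a one-shot score.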

Original language: English
Article number: 123039
Journal: Expert Systems with Applications
Volume: 245
State: Published - 1 Jul 2024

Keywords

  • Data generation
  • Deep learning
  • Document cross-domain analysis
  • Document layout analysis
