Image Layer Modeling for Complex Document Layout Generation

Tianlong Ma, Xingjiao Wu, Xiangcheng Du, Yanlong Wang, Cheng Jin

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Document layout analysis (DLA) plays an essential role in information extraction and document understanding. DLA has reached milestone achievements; however, DLA of non-Manhattan layouts remains challenging because annotated data are limited. In this paper, we propose an image layer modeling method to mitigate this issue. The method generates document images with non-Manhattan layouts by superimposing images under pre-defined aesthetic rules. Because no evaluation benchmark exists for non-Manhattan layouts, we constructed a manually labeled dataset for fine-grained segmentation of non-Manhattan layouts. To the best of our knowledge, it is the first dataset of this kind. Extensive experimental results verify that the proposed method better handles fine-grained segmentation of documents with non-Manhattan layouts.

Original language: English
Title of host publication: Proceedings - 2023 IEEE International Conference on Multimedia and Expo, ICME 2023
Publisher: IEEE Computer Society
Pages: 2261-2266
Number of pages: 6
ISBN (Electronic): 9781665468916
State: Published - 2023
Externally published: Yes
Event: 2023 IEEE International Conference on Multimedia and Expo, ICME 2023 - Brisbane, Australia
Duration: 10 Jul 2023 to 14 Jul 2023

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
Volume: 2023-July
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2023 IEEE International Conference on Multimedia and Expo, ICME 2023
Country/Territory: Australia
City: Brisbane
Period: 10/07/23 to 14/07/23

Keywords

  • Document layout analysis
  • data augmentation
  • deep learning
  • non-Manhattan layout
