DOCUMENT LAYOUT ANALYSIS VIA POSITIONAL ENCODING

Ejian Zhou, Xingjiao Wu, Luwei Xiao, Xiangcheng Du, Tianlong Ma, Liang He

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Document layout analysis plays a vital role in computer vision research. Most current methods rely on pixel-based classification. However, pixel-based classification is insufficient for maintaining the continuity of the classified area. In this paper, we propose a document layout analysis method based on positional encoding and bounding box specification. We maintain the continuity of the analysis area by building the document layout analysis framework on bounding boxes. We also integrate a positional encoding module into the framework to preserve detailed information during the layout analysis and modeling process. Experimental results show that the proposed method achieves state-of-the-art results.
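The continuity argument in the abstract can be illustrated with a small sketch. This is not the authors' framework. It is a toy example, assuming a binary mask as the output of a hypothetical pixel classifier, that shows why a per-pixel labelling can leave holes inside a region while a bounding box is contiguous by construction. The names `mask_to_box`, `box_to_mask`, `count_holes`, and `noisy` are illustrative assumptions, not taken from the paper.

```python
from typing import List, Tuple

Mask = List[List[int]]           # 1 = pixel labelled as the region, 0 = background
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def mask_to_box(mask: Mask) -> Box:
    """Return the tightest box enclosing every positive pixel in the mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no positive pixels")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def box_to_mask(box: Box, height: int, width: int) -> Mask:
    """Rasterise a box back into a mask; every pixel inside the box is 1."""
    top, left, bottom, right = box
    return [
        [1 if top <= r <= bottom and left <= c <= right else 0 for c in range(width)]
        for r in range(height)
    ]


def count_holes(mask: Mask, box: Box) -> int:
    """Count background pixels that fall inside the given box."""
    top, left, bottom, right = box
    return sum(
        1
        for r in range(top, bottom + 1)
        for c in range(left, right + 1)
        if mask[r][c] == 0
    )


# Hypothetical fragmented output of a pixel classifier for one text block.
noisy: Mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

box = mask_to_box(noisy)
filled = box_to_mask(box, height=5, width=5)
print(box, count_holes(noisy, box), count_holes(filled, box))
```

The pixel mask leaves gaps inside the region it is meant to cover, whereas the box representation has none. How the paper predicts boxes and where positional encoding enters the model are not specified in the abstract and are not modelled here.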

Original language: English
Title of host publication: 2022 IEEE International Conference on Image Processing, ICIP 2022 - Proceedings
Publisher: IEEE Computer Society
Pages: 1156-1160
Number of pages: 5
ISBN (Electronic): 9781665496209
State: Published - 2022
Event: 29th IEEE International Conference on Image Processing, ICIP 2022 - Bordeaux, France
Duration: 16 Oct 2022 - 19 Oct 2022

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
ISSN (Print): 1522-4880

Conference

Conference: 29th IEEE International Conference on Image Processing, ICIP 2022
Country/Territory: France
City: Bordeaux
Period: 16/10/22 - 19/10/22

Keywords

  • Document layout analysis
  • bounding box
  • deep learning
  • position-encoding
