Document image layout analysis via explicit edge embedding network

Research output: Contribution to journal › Article › peer-review

16 Scopus citations

Abstract

Layout analysis of document images plays an important role in document content understanding and information extraction systems. While many existing methods focus on learning with convolutional networks directly from color channels, we argue for the importance of high-frequency structures in document images, especially edge information. In this paper, we present a novel document layout analysis framework built on the Explicit Edge Embedding Network (E3 Net). Specifically, the proposed network contains an edge embedding block and a dynamic skip connection block to produce detailed features, as well as a lightweight fully convolutional subnet as the backbone for the effectiveness of the framework. The edge embedding block is designed to explicitly incorporate edge information from the document images. The dynamic skip connection block aims to learn both color and edge representations with learnable weights. In contrast to previous methods, we train the model using a synthetic document approach to overcome data scarcity. The combination of data augmentation and edge embedding is important for obtaining a more compact representation than directly using training images with only color channels. We conduct experiments with the proposed framework on three document layout analysis benchmarks and demonstrate its superiority in effectiveness and efficiency over previous approaches.
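
As a rough illustration of the two ideas named in the abstract (an explicit edge channel, and a weighted blend of color and edge features), here is a minimal sketch in plain Python. It is not the E3 Net implementation: the abstract does not say how edges are extracted or how the learnable weights are parameterized. The Sobel operator, the softmax normalization, and the names `sobel_edge_map`, `fuse`, and `logits` are all assumptions made for this example.

```python
import math

# Standard 3x3 Sobel kernels (assumed edge operator, not taken from the paper).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edge_map(gray):
    """Return the gradient magnitude of a 2D grayscale image (list of rows).

    Border pixels are left at 0.0 for simplicity.
    """
    h, w = len(gray), len(gray[0])
    edges = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            edges[y][x] = math.hypot(gx, gy)
    return edges


def fuse(color_feat, edge_feat, logits):
    """Blend two same-sized feature maps with softmax-normalized weights.

    `logits` stands in for learnable parameters (hypothetical); in a real
    network they would be updated by gradient descent, here they are fixed.
    """
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    w_color, w_edge = exps[0] / total, exps[1] / total
    return [
        [w_color * c + w_edge * e for c, e in zip(c_row, e_row)]
        for c_row, e_row in zip(color_feat, edge_feat)
    ]


# Toy 5x5 "page": dark left columns, bright right columns (one vertical boundary).
page = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edges = sobel_edge_map(page)
fused = fuse(page, edges, logits=[0.0, 0.0])  # equal logits give a 50/50 blend
```

On the toy page, the edge map is nonzero only next to the dark/bright boundary, so the fused map keeps the intensity signal while emphasizing the boundary. Changing `logits` shifts the balance between the two sources, which is the role the abstract assigns to the learnable weights.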

Original language: English
Pages (from-to): 436-448
Number of pages: 13
Journal: Information Sciences
Volume: 577
State: Published - Oct 2021

Keywords

  • Deep learning
  • Document layout analysis
  • Dynamic skip connection
  • Edge embedding block
