TY - JOUR
T1 - Document image layout analysis via explicit edge embedding network
AU - Wu, Xingjiao
AU - Zheng, Yingbin
AU - Ma, Tianlong
AU - Ye, Hao
AU - He, Liang
N1 - Publisher Copyright: © 2021 Elsevier Inc.
PY - 2021/10
Y1 - 2021/10
AB - Layout analysis of document images plays an important role in document content understanding and information extraction systems. While many existing methods focus on learning with convolutional networks directly from color channels, we argue for the importance of high-frequency structures in document images, especially edge information. In this paper, we present a novel document layout analysis framework built on the Explicit Edge Embedding Network (E3 Net). Specifically, the proposed network contains an edge embedding block and a dynamic skip connection block to produce detailed features, as well as a lightweight fully convolutional subnet as the backbone for the effectiveness of the framework. The edge embedding block is designed to explicitly incorporate edge information from the document images. The dynamic skip connection block aims to learn both color and edge representations with learnable weights. In contrast to previous methods, we train the model using a synthetic document approach to overcome data scarcity. The combination of data augmentation and edge embedding is important for obtaining a more compact representation than directly using the training images with only color channels. We conduct experiments with the proposed framework on three document layout analysis benchmarks and demonstrate its superiority in effectiveness and efficiency over previous approaches.
KW - Deep learning
KW - Document layout analysis
KW - Dynamic skip connection
KW - Edge embedding block
UR - https://www.scopus.com/pages/publications/85110511434
U2 - 10.1016/j.ins.2021.07.020
DO - 10.1016/j.ins.2021.07.020
M3 - Article
AN - SCOPUS:85110511434
SN - 0020-0255
VL - 577
SP - 436
EP - 448
JO - Information Sciences
JF - Information Sciences
ER -