TY - GEN
T1 - Image Layer Modeling for Complex Document Layout Generation
AU - Ma, Tianlong
AU - Wu, Xingjiao
AU - Du, Xiangcheng
AU - Wang, Yanlong
AU - Jin, Cheng
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Document layout analysis (DLA) plays an essential role in information extraction and document understanding. At present, DLA has reached the milestone achievement; however, DLA of non-Manhattan is still challenging because of annotation data limitations. In this paper, we propose an image layer modeling method to mitigate this issue. The image layer modeling method generates document images of non-Manhattan layouts by superimposing images under pre-defined aesthetic rules. Due to the lack of evaluation benchmark for non-Manhattan layout, we have constructed a manually-labeled non-Manhattan layout fine-grained segmentation dataset. To the best of our knowledge, this is the first manually-labeled non-Manhattan layout fine-grained segmentation dataset. Extensive experimental results verify that our proposed image layer modeling method can better deal with the fine-grained segmented document of the non-Manhattan layout.
AB - Document layout analysis (DLA) plays an essential role in information extraction and document understanding. At present, DLA has reached the milestone achievement; however, DLA of non-Manhattan is still challenging because of annotation data limitations. In this paper, we propose an image layer modeling method to mitigate this issue. The image layer modeling method generates document images of non-Manhattan layouts by superimposing images under pre-defined aesthetic rules. Due to the lack of evaluation benchmark for non-Manhattan layout, we have constructed a manually-labeled non-Manhattan layout fine-grained segmentation dataset. To the best of our knowledge, this is the first manually-labeled non-Manhattan layout fine-grained segmentation dataset. Extensive experimental results verify that our proposed image layer modeling method can better deal with the fine-grained segmented document of the non-Manhattan layout.
KW - Docuemnt layout analysis
KW - data augmentation
KW - deep learning
KW - non-Manhattan layout
UR - https://www.scopus.com/pages/publications/85171181274
U2 - 10.1109/ICME55011.2023.00386
DO - 10.1109/ICME55011.2023.00386
M3 - 会议稿件
AN - SCOPUS:85171181274
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
SP - 2261
EP - 2266
BT - Proceedings - 2023 IEEE International Conference on Multimedia and Expo, ICME 2023
PB - IEEE Computer Society
T2 - 2023 IEEE International Conference on Multimedia and Expo, ICME 2023
Y2 - 10 July 2023 through 14 July 2023
ER -