LSTMVAEF: Vivid Layout via LSTM-Based Variational Autoencoder Framework

Jie He, Xingjiao Wu, Wenxin Hu, Jing Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The lack of training data remains a challenge in the Document Layout Analysis (DLA) task. Synthetic data is an effective way to tackle this challenge. In this paper, we propose an LSTM-based Variational Autoencoder framework (LSTMVAEF) to synthesize layouts for DLA. Compared with previous methods, ours can generate more complicated layouts and requires only the training data from DLA, without extra annotation. We use LSTM models as the basic models to learn a latent representation of the class and position information of the elements within a page. We also design a weight adaptation strategy to help the model train faster. Experiments show that our model can generate vivid layouts from only a few real document pages.
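To make the abstract's setup concrete, the sketch below illustrates one plausible way to represent a page layout as a sequence of element class/position vectors and to apply the standard VAE reparameterization step. All names, dimensions, and the element encoding are illustrative assumptions, not the authors' actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed representation: a page layout as a sequence of elements,
# each (class_id, x, y, w, h) with coordinates normalized to [0, 1].
layout = [
    (0, 0.10, 0.05, 0.80, 0.10),  # e.g. class 0 = title
    (1, 0.10, 0.20, 0.38, 0.70),  # e.g. class 1 = text column
    (1, 0.52, 0.20, 0.38, 0.70),
]

def encode_element(cls, x, y, w, h, n_classes=5):
    """One-hot class label + box geometry -> fixed-size input vector."""
    v = np.zeros(n_classes + 4)
    v[cls] = 1.0
    v[n_classes:] = [x, y, w, h]
    return v

# The sequence an LSTM encoder would consume, shape (T, n_classes + 4).
seq = np.stack([encode_element(*e) for e in layout])

def reparameterize(mu, log_var, rng):
    """z = mu + sigma * eps: the standard VAE sampling trick that keeps
    the latent draw differentiable with respect to mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Pretend an LSTM encoder produced these latent parameters (dim 16 assumed).
mu = np.zeros(16)
log_var = np.zeros(16)
z = reparameterize(mu, log_var, rng)
print(seq.shape, z.shape)  # (3, 9) (16,)
```

An LSTM decoder would then unroll `z` back into a sequence of element vectors, yielding a synthetic layout; the weight adaptation strategy mentioned in the abstract is a training detail not reproduced here.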

Original language: English
Title of host publication: Document Analysis and Recognition – ICDAR 2021 - 16th International Conference, Proceedings
Editors: Josep Lladós, Daniel Lopresti, Seiichi Uchida
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 176-189
Number of pages: 14
ISBN (Print): 9783030863302
DOIs
State: Published - 2021
Event: 16th International Conference on Document Analysis and Recognition, ICDAR 2021 - Lausanne, Switzerland
Duration: 5 Sep 2021 – 10 Sep 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12822 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Conference on Document Analysis and Recognition, ICDAR 2021
Country/Territory: Switzerland
City: Lausanne
Period: 5/09/21 – 10/09/21

Keywords

  • Document Layout Analysis
  • Document generation
  • Variational Autoencoder
