Automatic text generation via text extraction based on submodular

  • Lisi Ai
  • Na Li
  • Jianbing Zheng
  • Ming Gao*
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

Automatic text generation is the generation of natural language text by computer. It has many applications, including automatic report generation and online promotion. Despite many existing studies, the problem remains challenging because generated text often lacks readability and coherence. In this paper, we propose a two-phase algorithm, consisting of text cleanup and text extraction, to automatically generate text from multiple source texts. In the first phase, we generate paragraphs using topic modeling and clustering analysis. In the second phase, we identify keywords by their TF-IDF scores, model text extraction as a set covering problem, and solve it using submodular optimization. We conduct a set of experiments, and the results demonstrate the effectiveness of the proposed method compared with several baselines.
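
The second phase described above can be illustrated with a minimal sketch. The abstract does not give the paper's exact objective or solver, so the code below assumes a standard weighted keyword-coverage objective (a monotone submodular function) maximized greedily under a sentence budget. The function names `tfidf_keywords` and `greedy_cover`, the whitespace tokenizer, and the smoothed IDF formula are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k):
    """Return the top_k terms (with scores) ranked by summed TF-IDF.

    Assumption: whitespace tokenization and a smoothed IDF; the paper's
    actual preprocessing is not stated in the abstract.
    """
    tokenized = [doc.lower().split() for doc in docs]
    tokenized = [tokens for tokens in tokenized if tokens]  # skip empty docs
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = Counter()
    for tokens in tokenized:
        for term, count in Counter(tokens).items():
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            scores[term] += (count / len(tokens)) * idf
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_k])


def greedy_cover(sentences, keyword_weights, budget):
    """Pick up to `budget` sentences that greedily maximize covered keyword weight.

    Each step adds the sentence with the largest marginal gain, i.e. the
    total weight of keywords it contains that are not yet covered.
    """
    sentence_terms = [
        {t for t in s.lower().split() if t in keyword_weights} for s in sentences
    ]
    covered, chosen = set(), []
    for _ in range(budget):
        best_idx, best_gain = None, 0.0
        for idx, terms in enumerate(sentence_terms):
            if idx in chosen:
                continue
            gain = sum(keyword_weights[t] for t in terms - covered)
            if gain > best_gain:
                best_idx, best_gain = idx, gain
        if best_idx is None:  # no remaining sentence adds new coverage
            break
        chosen.append(best_idx)
        covered |= sentence_terms[best_idx]
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]
```

Because weighted coverage is monotone and submodular, greedy selection under a cardinality budget is a common choice for this kind of extraction, and it stops early once no sentence contributes an uncovered keyword.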

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10612 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 1st Asia-Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data, APWeb-WAIM 2017, held in conjunction with the International Workshop on Mobile Web Data Analytics, MWDA 2017, International Workshop on Hot Topics in Big Spatial Data and Urban Computing, HotSpatial 2017, International Workshop on Graph Data Management and Analysis, GDMA 2017, 2nd International Workshop on Data Driven Crowdsourcing, DDC 2017, 2nd International Workshop on Spatio-temporal Data Management and Analytics, SDMA 2017 and International Workshop on Mobility Analytics from Spatial and Social Data, MASS 2017
Country/Territory: China
City: Beijing
Period: 7/07/17 → 9/07/17

Keywords

  • Automatic text generation
  • K-Means
  • Massive information
  • Submodular
