TaskSum: Task-Driven Extractive Text Summarization for Long News Documents Based on Reinforcement Learning

Moming Tang, Dawei Cheng, Cen Chen*, Yuqi Liang, Yifeng Luo, Weining Qian

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

A popular, state-of-the-art family of extractive summarization methods explores pre-trained language models through reinforcement learning (RL). Despite promising results, existing RL-based methods suffer from three drawbacks. First, they often adopt sparse reward signal schemes that reward only some of the extracted sentences, so salient sentences can be neglected. Second, they often treat summarization as an independent task and neglect its latent connections with other downstream tasks, which could in turn provide useful hints to guide the upstream summarization task. Third, the length of the input sequence in most summarization methods is restricted by the pre-trained language model used. To address these problems, we propose a novel RL-based Seq2Seq extractive summarization model, TaskSum, which combines extractive text summarization with multiple associated tasks via a dense reward signal scheme. Moreover, we implement a BERT-based hierarchical encoder to effectively encode documents of arbitrary length. Empirical results demonstrate that TaskSum can overcome the above-mentioned drawbacks of existing RL-based summarization methods and achieve significantly better results on long documents.
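The contrast between sparse and dense reward schemes mentioned in the abstract can be illustrated with a minimal sketch. This is not the TaskSum reward, which the abstract does not specify. It assumes a toy set-based unigram F1 score as a stand-in for a real metric such as ROUGE, and the function names `unigram_f1`, `sparse_rewards`, and `dense_rewards` are hypothetical.

```python
from typing import List


def unigram_f1(candidate: List[str], reference: List[str]) -> float:
    """Toy set-based unigram F1 (an assumed stand-in for a metric like ROUGE)."""
    cand, ref = set(candidate), set(reference)
    overlap = len(cand & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def sparse_rewards(selected: List[List[str]], reference: List[str]) -> List[float]:
    """Sparse scheme (assumed form): only the last extraction step is rewarded."""
    rewards = [0.0] * len(selected)
    if selected:
        tokens = [tok for sent in selected for tok in sent]
        rewards[-1] = unigram_f1(tokens, reference)
    return rewards


def dense_rewards(selected: List[List[str]], reference: List[str]) -> List[float]:
    """Dense scheme (assumed form): each step gets the marginal gain in score."""
    rewards: List[float] = []
    tokens: List[str] = []
    previous = 0.0
    for sent in selected:
        tokens = tokens + sent
        score = unigram_f1(tokens, reference)
        rewards.append(score - previous)
        previous = score
    return rewards
```

Under these assumptions, the dense rewards sum to the same final score the sparse scheme assigns to the last step, but every extracted sentence receives its own signal, so an unhelpful sentence can be penalized at the step where it was chosen.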

Original language: English
Title of host publication: Database Systems for Advanced Applications - 27th International Conference, DASFAA 2022, Proceedings
Editors: Arnab Bhattacharya, Janice Lee Mong Li, Divyakant Agrawal, P. Krishna Reddy, Mukesh Mohania, Anirban Mondal, Vikram Goyal, Rage Uday Kiran
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 306-313
Number of pages: 8
ISBN (Print): 9783031001284
State: Published - 2022
Event: 27th International Conference on Database Systems for Advanced Applications, DASFAA 2022 - Virtual, Online
Duration: 11 Apr 2022 to 14 Apr 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13247 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th International Conference on Database Systems for Advanced Applications, DASFAA 2022
City: Virtual, Online
Period: 11/04/22 to 14/04/22

Keywords

  • Extractive summarization
  • Reinforcement learning
