Abstractive Summarization Model with Adaptive Sparsemax

  • Shiqi Guo
  • Yumeng Si
  • Jing Zhao*

*Corresponding author for this work
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Abstractive summarization models mostly rely on Sequence-to-Sequence architectures, in which the softmax function is widely used to map the model's output onto the probability simplex. However, softmax's output distribution often exhibits a long tail, especially when the vocabulary is large: many irrelevant tokens receive non-negligible probability mass, which reduces training efficiency and effectiveness. More recently, some work has begun to design mapping functions that produce sparse output probabilities so that these irrelevant tokens are ignored. In this paper, we propose Adaptive Sparsemax, which self-adaptively controls the sparsity of the model's output. Our method combines sparsemax with a temperature mechanism, where the temperature value is learned by the neural network. One advantage of our method is that it requires no hyperparameters. Experimental results on the CNN-Daily Mail and LCSTS datasets show that our method outperforms baseline models on the abstractive summarization task.
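The sparsemax mapping the abstract builds on (Martins and Astudillo's projection onto the simplex) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the paper learns the temperature with a neural network, whereas here `temperature` is a plain argument so the effect on sparsity is visible. A low temperature sharpens the logits and zeroes out more tokens; a high temperature yields a denser distribution.

```python
import numpy as np

def sparsemax(logits, temperature=1.0):
    """Sparsemax: Euclidean projection of (logits / temperature) onto the simplex.

    Unlike softmax, the result can assign exactly zero probability to
    low-scoring tokens. `temperature` here is a fixed argument for
    illustration; in the paper it is produced by the network itself.
    """
    z = np.asarray(logits, dtype=float) / temperature
    z_sorted = np.sort(z)[::-1]            # sort scores in descending order
    k = np.arange(1, len(z) + 1)
    cumsum = np.cumsum(z_sorted)
    support = 1 + k * z_sorted > cumsum    # which sorted entries stay in the support
    k_z = k[support][-1]                   # size of the support
    tau = (cumsum[support][-1] - 1.0) / k_z  # threshold subtracted from every score
    return np.maximum(z - tau, 0.0)
```

For example, `sparsemax([2.0, 1.0, 0.1])` puts all mass on the first token, while the same logits with `temperature=10.0` spread probability over all three; both outputs still sum to one.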

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 11th CCF International Conference, NLPCC 2022, Proceedings
Editors: Wei Lu, Shujian Huang, Yu Hong, Xiabing Zhou
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 810-821
Number of pages: 12
ISBN (Print): 9783031171192
State: Published - 2022
Event: 11th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2022 - Guilin, China
Duration: 24 Sep 2022 - 25 Sep 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13551 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2022
Country/Territory: China
City: Guilin
Period: 24/09/22 - 25/09/22

Keywords

  • Abstractive summarization
  • Adaptive sparsemax
  • Seq2Seq
