
ParaSum: Contrastive Paraphrasing for Low-Resource Extractive Text Summarization

  • East China Normal University
  • Alibaba Group Holding Ltd.

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Existing extractive summarization methods achieve state-of-the-art (SOTA) performance with pre-trained language models (PLMs) and sufficient training data. However, PLM-based methods are known to be data-hungry and often fail to deliver satisfactory results in low-resource scenarios. Constructing a high-quality summarization dataset with human-authored reference summaries is a prohibitively expensive task. To address these challenges, this paper proposes a novel paradigm for low-resource extractive summarization, called ParaSum. This paradigm reformulates text summarization as textual paraphrasing, aligning the text summarization task with the self-supervised Next Sentence Prediction (NSP) task of PLMs. This approach minimizes the training gap between the summarization model and PLMs, enabling a more effective probing of the knowledge encoded within PLMs and enhancing the summarization performance. Furthermore, to relax the requirement for large amounts of training data, we introduce a simple yet efficient model and align the training paradigm of summarization to textual paraphrasing to facilitate network-based transfer learning. Extensive experiments over two widely used benchmarks (i.e., CNN/DailyMail, Xsum) and a recent open-sourced high-quality Chinese benchmark (i.e., CNewSum) show that ParaSum consistently outperforms existing PLM-based summarization methods in all low-resource settings, demonstrating its effectiveness over different types of datasets.

Original language: English
Title of host publication: Knowledge Science, Engineering and Management - 16th International Conference, KSEM 2023, Proceedings
Editors: Zhi Jin, Yuncheng Jiang, Wenjun Ma, Robert Andrei Buchmann, Ana-Maria Ghiran, Yaxin Bi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 106-119
Number of pages: 14
ISBN (Print): 9783031402883
DOI
Publication status: Published - 2023
Event: Knowledge Science, Engineering and Management - 16th International Conference, KSEM 2023, Proceedings - Guangzhou, China
Duration: 16 Aug 2023 to 18 Aug 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14119 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Knowledge Science, Engineering and Management - 16th International Conference, KSEM 2023, Proceedings
Country/Territory: China
City: Guangzhou
Period: 16/08/23 to 18/08/23
