Abstractive Text Summarization Based on Language Model Conditioning and Locality Modeling

We explore to what extent knowledge about the pre-trained language model that is used is beneficial for the task of abstractive summarization. To this end, we experiment with conditioning the encoder and decoder of a Transformer-based neural model on the BERT language model. In addition, we propose a new method of BERT-windowing, which allows chunk-wise processing of texts longer than the BERT window size. We also explore how locality modelling, i.e., the explicit restriction of calculations to the local context, affects the summarization ability of the Transformer; to do so, we introduce 2-dimensional convolutional self-attention into the first layers of the encoder. We compare the results of our models to a baseline and to state-of-the-art models on the CNN/Daily Mail dataset. To demonstrate that the approach also works for German, we additionally train our models on the SwissText dataset. Both models outperform the baseline in ROUGE scores on the two datasets and show their superiority in a manual qualitative analysis.
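To make the BERT-windowing idea above more concrete, here is a minimal, hypothetical sketch of chunk-wise encoding with overlapping windows. It assumes the HuggingFace transformers API; the window and stride values, the averaging of overlapping positions, and the function `encode_with_windows` are illustrative assumptions and may differ from the paper's actual combination strategy.

```python
# Hypothetical sketch of chunk-wise BERT encoding for texts longer than
# BERT's input window. Window size, stride and the averaging of overlapping
# positions are assumptions, not the paper's exact method.
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

def encode_with_windows(text, window=512, stride=256):
    """Encode a long text by sliding a BERT-sized window over it and
    averaging the hidden states of overlapping token positions."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    hidden = torch.zeros(len(ids), model.config.hidden_size)
    counts = torch.zeros(len(ids), 1)
    for start in range(0, len(ids), stride):
        chunk = ids[start:start + window]
        with torch.no_grad():
            out = model(input_ids=torch.tensor([chunk])).last_hidden_state[0]
        hidden[start:start + len(chunk)] += out
        counts[start:start + len(chunk)] += 1
        if start + window >= len(ids):
            break
    return hidden / counts  # token-level representations for the full document
```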

Original title: Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling

We explore to what extent knowledge about the pre-trained language model that is used is beneficial for the task of abstractive summarization. To this end, we experiment with conditioning the encoder and decoder of a Transformer-based neural model on the BERT language model. In addition, we propose a new method of BERT-windowing, which allows chunk-wise processing of texts longer than the BERT window size. We also explore how locality modelling, i.e., the explicit restriction of calculations to the local context, can affect the summarization ability of the Transformer. This is done by introducing 2-dimensional convolutional self-attention into the first layers of the encoder. The results of our models are compared to a baseline and the state-of-the-art models on the CNN/Daily Mail dataset. We additionally train our model on the SwissText dataset to demonstrate usability on German. Both models outperform the baseline in ROUGE scores on two datasets and show its superiority in a manual qualitative analysis.
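The locality modelling mentioned in the abstract restricts the Transformer's attention computation to a local context. As a simplified illustration only (not the paper's actual 2-dimensional convolutional self-attention), the sketch below masks a single-head scaled dot-product self-attention to a fixed local window; the function `local_self_attention` and the window size are hypothetical.

```python
# Simplified illustration of locality modelling: self-attention whose
# receptive field is restricted to a fixed local window around each
# position. A 1D stand-in, not the paper's 2D convolutional variant.
import math
import torch

def local_self_attention(x, window=5):
    """x: (seq_len, d_model). Each position attends only to neighbours
    within +/- window//2 positions."""
    seq_len, d_model = x.shape
    q, k, v = x, x, x  # single head, no learned projections in this sketch
    scores = q @ k.transpose(0, 1) / math.sqrt(d_model)  # (seq_len, seq_len)

    # Band mask that keeps only the local neighbourhood of each position.
    idx = torch.arange(seq_len)
    dist = (idx.unsqueeze(0) - idx.unsqueeze(1)).abs()
    scores = scores.masked_fill(dist > window // 2, float("-inf"))

    weights = torch.softmax(scores, dim=-1)
    return weights @ v  # (seq_len, d_model)

# Example: 10 token vectors of size 16, each attending to a 5-token window
out = local_self_attention(torch.randn(10, 16), window=5)
```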

Authors: Dmitrii Aksenov, Julián Moreno-Schneider, Peter Bourgonje, Robert Schwarzenberg, Leonhard Hennig, Georg Rehm

Paper: https://arxiv.org/abs/2003.13027