End-to-end Named Entity Recognition and Relation Extraction using Pre-trained Language Models (CS CompLang)
- January 2, 2020
- Notes
Named entity recognition (NER) and relation extraction (RE) are two important tasks in information extraction and retrieval (IE & IR). Recent work has shown that learning these tasks jointly is beneficial: it avoids propagating the errors inherent in pipeline-based systems and improves performance. However, state-of-the-art joint models typically rely on external natural language processing (NLP) tools, such as dependency parsers, which limits their usefulness to domains (e.g. news) where those tools perform well. The few neural, end-to-end models that have been proposed are trained almost entirely from scratch. In this paper, we propose a neural, end-to-end model for jointly extracting entities and the relations between them that does not rely on external NLP tools and that integrates a large, pre-trained language model. Because the bulk of the model's parameters are pre-trained and the model eschews recurrence in favor of self-attention, it is fast to train. On 5 datasets across 3 domains, our model matches or exceeds state-of-the-art performance, sometimes by a large margin.
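To make the joint setup concrete, below is a minimal sketch of the general idea: a pre-trained transformer encoder shared by a token-level NER head and a relation-scoring head, trained end to end with no external NLP tools. This is not the paper's exact architecture; the encoder name, label counts, and the all-pairs relation scorer are illustrative assumptions, and the paper's relation module differs in detail.

```python
# A minimal sketch (assumptions noted above) of joint NER + RE on top of a
# pre-trained transformer encoder, in the spirit of the paper.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class JointNerRe(nn.Module):
    def __init__(self, encoder_name: str = "bert-base-cased",
                 num_entity_labels: int = 9, num_relation_labels: int = 5):
        super().__init__()
        # The bulk of the parameters come pre-trained, as in the paper.
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Token-level NER head (e.g. BIO tagging).
        self.ner_head = nn.Linear(hidden, num_entity_labels)
        # Simplified pairwise RE head: scores a relation label for every
        # (head token, tail token) pair of contextual representations.
        self.re_head = nn.Sequential(
            nn.Linear(2 * hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_relation_labels),
        )

    def forward(self, input_ids, attention_mask):
        # Contextual token states from the shared pre-trained encoder.
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state
        ner_logits = self.ner_head(h)                  # (B, T, num_entity_labels)
        B, T, H = h.shape
        # Build all (head, tail) token pairs for relation scoring.
        heads = h.unsqueeze(2).expand(B, T, T, H)
        tails = h.unsqueeze(1).expand(B, T, T, H)
        re_logits = self.re_head(torch.cat([heads, tails], dim=-1))
        return ner_logits, re_logits                   # RE: (B, T, T, num_relation_labels)


tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = JointNerRe()
batch = tokenizer(["John Giorgi works at the University of Toronto."],
                  return_tensors="pt")
ner_logits, re_logits = model(batch["input_ids"], batch["attention_mask"])
print(ner_logits.shape, re_logits.shape)
```

Sharing one pre-trained encoder between the two heads is what lets most parameters arrive already trained and lets both tasks be optimized jointly, avoiding the error propagation of a pipeline.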
Original title: End-to-end Named Entity Recognition and Relation Extraction using Pre-trained Language Models
Original abstract: Named entity recognition (NER) and relation extraction (RE) are two important tasks in information extraction and retrieval (IE & IR). Recent work has demonstrated that it is beneficial to learn these tasks jointly, which avoids the propagation of error inherent in pipeline-based systems and improves performance. However, state-of-the-art joint models typically rely on external natural language processing (NLP) tools, such as dependency parsers, limiting their usefulness to domains (e.g. news) where those tools perform well. The few neural, end-to-end models that have been proposed are trained almost completely from scratch. In this paper, we propose a neural, end-to-end model for jointly extracting entities and their relations which does not rely on external NLP tools and which integrates a large, pre-trained language model. Because the bulk of our model's parameters are pre-trained and we eschew recurrence for self-attention, our model is fast to train. On 5 datasets across 3 domains, our model matches or exceeds state-of-the-art performance, sometimes by a large margin.
Original authors: John Giorgi, Xindi Wang, Nicola Sahar, Won Young Shin, Gary D. Bader, Bo Wang
Original link: https://arxiv.org/abs/1912.13415