Depth-Adaptive Graph Recurrent Network for Text Classification (cs.CL)

  • March 27, 2020
  • Notes

The Sentence-State LSTM (S-LSTM) is a powerful and efficient graph recurrent network that treats words as nodes and performs layer-wise recurrent steps between them simultaneously. Despite its success on text representation, the S-LSTM still has two drawbacks. First, within a sentence some words are more ambiguous than others, so these difficult words need more computation steps while easier words need fewer; the S-LSTM, however, applies a fixed number of computation steps to every word regardless of its difficulty. Second, it lacks the sequential information (e.g., word order) that is inherently important for natural language. This paper addresses these issues by proposing a depth-adaptive mechanism for the S-LSTM, which lets the model learn how many computation steps to take for each word as needed. In addition, an extra RNN layer is integrated to inject sequential information, and its output also serves as an input feature for deciding the adaptive depth. Results on classic text classification tasks (24 datasets of various sizes and domains) show that the model brings significant improvements over the conventional S-LSTM and other strong models (e.g., the Transformer), while achieving a good accuracy-speed trade-off.
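To make the depth-adaptive idea more concrete, below is a minimal PyTorch sketch of one plausible per-word halting scheme in the spirit of adaptive computation time. It is only an illustration under my own assumptions, not the paper's implementation: the class name `DepthAdaptiveHalting`, the `GRUCell` standing in for one S-LSTM node update, and the `max_steps` / `threshold` values are all hypothetical.

```python
import torch
import torch.nn as nn


class DepthAdaptiveHalting(nn.Module):
    """Toy ACT-style per-word halting (an illustrative assumption, not the paper's code).

    Each word node refines its hidden state step by step; a learned halting
    probability accumulates per word, and a word stops receiving updates once
    its cumulative probability crosses a threshold. The output is the
    probability-weighted mixture of the intermediate states.
    """

    def __init__(self, hidden_size: int, max_steps: int = 6, threshold: float = 0.99):
        super().__init__()
        self.max_steps = max_steps
        self.threshold = threshold
        # Stand-in for one S-LSTM-style node update; the real model would
        # aggregate neighbouring word states and a global sentence state.
        self.update = nn.GRUCell(hidden_size, hidden_size)
        # Predicts, per word, the probability of halting at the current step.
        self.halt = nn.Linear(hidden_size, 1)

    def forward(self, word_states: torch.Tensor) -> torch.Tensor:
        # word_states: (num_words, hidden_size) -- one row per word node.
        state = word_states
        halting_sum = torch.zeros(word_states.size(0), device=word_states.device)
        still_running = torch.ones_like(halting_sum)
        weighted_state = torch.zeros_like(state)

        for _ in range(self.max_steps):
            state = self.update(word_states, state)            # one recurrent step
            p = torch.sigmoid(self.halt(state)).squeeze(-1)    # halting prob per word
            # Words whose cumulative probability would cross the threshold halt now.
            new_halted = (halting_sum + p * still_running > self.threshold).float() * still_running
            still_running = still_running * (1.0 - new_halted)
            # Running words contribute p; newly halted words contribute the remainder.
            contrib = p * still_running + (1.0 - halting_sum) * new_halted
            halting_sum = halting_sum + p * still_running
            weighted_state = weighted_state + contrib.unsqueeze(-1) * state
            if still_running.sum() == 0:                        # every word has halted
                break

        # Words that never halted contribute their remaining probability mass.
        weighted_state = weighted_state + ((1.0 - halting_sum) * still_running).unsqueeze(-1) * state
        return weighted_state


if __name__ == "__main__":
    layer = DepthAdaptiveHalting(hidden_size=8)
    out = layer(torch.randn(5, 8))    # five word nodes with 8-dim states
    print(out.shape)                  # torch.Size([5, 8])
```

In the actual model, the per-step update would be the S-LSTM's graph recurrent step over neighbouring words and a global sentence node, and, per the abstract, the halting decision would also take the extra RNN layer's sequential features as input.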

Original title: Depth-Adaptive Graph Recurrent Network for Text Classification

Original abstract: The Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network, which views words as nodes and performs layer-wise recurrent steps between them simultaneously. Despite its successes on text representations, the S-LSTM still suffers from two drawbacks. Firstly, given a sentence, certain words are usually more ambiguous than others, and thus more computation steps need to be taken for these difficult words and vice versa. However, the S-LSTM takes fixed computation steps for all words, irrespective of their hardness. The second one comes from the lack of sequential information (e.g., word order) that is inherently important for natural language. In this paper, we try to address these issues and propose a depth-adaptive mechanism for the S-LSTM, which allows the model to learn how many computational steps to conduct for different words as required. In addition, we integrate an extra RNN layer to inject sequential information, which also serves as an input feature for the decision of adaptive depths. Results on the classic text classification task (24 datasets in various sizes and domains) show that our model brings significant improvements against the conventional S-LSTM and other high-performance models (e.g., the Transformer), meanwhile achieving a good accuracy-speed trade-off.

Authors: Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou

Original URL: https://arxiv.org/abs/2003.00166