Depth-Adaptive Graph Recurrent Network for Text Classification (cs.CL)

  • March 27, 2020
  • Notes

The Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network that views words as nodes and performs layer-wise recurrent steps between them simultaneously. Despite its success on text representation, the S-LSTM still suffers from two drawbacks. First, some words in a sentence are usually more ambiguous than others, so more computation steps should be spent on these difficult words and fewer on the easy ones; the S-LSTM, however, takes a fixed number of computation steps for all words, regardless of their hardness. Second, it lacks the sequential information (e.g., word order) that is inherently important for natural language. In this paper, we address these issues and propose a depth-adaptive mechanism for the S-LSTM, which lets the model learn how many computation steps to take for each word as required. In addition, we integrate an extra RNN layer to inject sequential information, which also serves as an input feature for deciding the adaptive depths. Results on the classic text classification task (24 datasets of various sizes and domains) show that our model brings significant improvements over the conventional S-LSTM and other high-performance models (e.g., the Transformer), while achieving a good accuracy-speed trade-off.
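A note on the core idea: "learning how many computation steps to conduct for different words" is the idea behind Adaptive Computation Time (ACT)-style halting, where each word accumulates a halting probability across recurrent steps and stops updating once it crosses a threshold. The PyTorch sketch below is a minimal illustration under that assumption, not the paper's actual implementation; all names (`DepthAdaptiveRecurrence`, `halt`, `max_steps`, `threshold`) are hypothetical, and a `GRUCell` stands in for the S-LSTM node update.

```python
import torch
import torch.nn as nn

class DepthAdaptiveRecurrence(nn.Module):
    """Per-word adaptive-depth recurrence with ACT-style halting.

    A sketch of the idea in the abstract: each word halts its own
    recurrent updates once its accumulated halting probability crosses
    a threshold, so "hard" words get more steps than easy ones. A
    GRUCell stands in for the S-LSTM node update; in the paper, features
    from an extra sequential RNN layer also feed the depth decision.
    """

    def __init__(self, hidden_size: int, max_steps: int = 6, threshold: float = 0.99):
        super().__init__()
        self.cell = nn.GRUCell(hidden_size, hidden_size)  # stand-in node update
        self.halt = nn.Linear(hidden_size, 1)             # per-word halting score
        self.max_steps = max_steps
        self.threshold = threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_words, hidden) -- one row per word node, e.g. the output
        # of an RNN layer that injects word-order information.
        state = x
        acc = torch.zeros(x.size(0), device=x.device)  # accumulated halting prob
        output = torch.zeros_like(state)               # weighted mean of states
        running = torch.ones(x.size(0), dtype=torch.bool, device=x.device)

        for step in range(self.max_steps):
            p = torch.sigmoid(self.halt(state)).squeeze(-1)
            halt_now = acc + p >= self.threshold
            if step == self.max_steps - 1:             # force stragglers to stop
                halt_now = torch.ones_like(halt_now)
            newly_halted = running & halt_now
            # Newly halted words contribute their probability remainder;
            # still-running words contribute p; finished words contribute 0.
            weight = torch.where(newly_halted, 1.0 - acc,
                                 p * (running & ~newly_halted).float())
            output = output + weight.unsqueeze(-1) * state
            acc = acc + p * running.float()
            running = running & ~newly_halted
            if not running.any():
                break
            state = self.cell(x, state)                # one more step for unfinished words
        return output

# Toy usage: 5 word nodes with hidden size 8.
words = torch.randn(5, 8)
layer = DepthAdaptiveRecurrence(hidden_size=8)
print(layer(words).shape)  # torch.Size([5, 8])
```

Under this scheme, an ambiguous word accumulates halting probability slowly and therefore receives more recurrent updates, while an easy word stops after a step or two, which is the per-word accuracy-speed trade-off the abstract describes.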

Original title: Depth-Adaptive Graph Recurrent Network for Text Classification

Original authors: Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou

Original link: https://arxiv.org/abs/2003.00166