Deep Learning Approach for Intelligent Named Entity Recognition of Cyber Security (cs.CL)

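In recent years, the volume of cyber security data generated as unstructured text, such as social media posts, blogs, and articles, has grown exceptionally. Named Entity Recognition (NER) is the first step in converting this unstructured data into structured data that many applications can use. Existing NER methods for cyber security data rely on rules and linguistic features. This paper proposes a Deep Learning (DL) approach that embeds Conditional Random Fields (CRFs). Several DL architectures were evaluated to find the best one. The combination of a Bidirectional Gated Recurrent Unit (Bi-GRU), a Convolutional Neural Network (CNN), and a CRF outperformed the other DL frameworks tested on a publicly available benchmark dataset. This may be because the bidirectional structure preserves features related to both the preceding and the following words in a sequence.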
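To make the "unstructured to structured" step concrete, here is a minimal sketch of how per-token NER labels could be grouped into structured entity records. It assumes the tagger emits BIO-style labels, which is a common NER convention but is not confirmed by the abstract. The function name `bio_to_entities`, the example sentence, and the `Malware` and `OS` label names are all hypothetical and are not taken from the paper or its dataset.

```python
from typing import List, Optional, Tuple


def bio_to_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Assumes tags look like "B-<type>", "I-<type>", or "O".
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Emit the entity collected so far (if any) and reset the buffer.
        nonlocal current_tokens, current_type
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that was open.
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and skip this token.
            flush()
    flush()
    return entities


# Hypothetical example: the sentence and label names are illustrative only.
tokens = ["The", "WannaCry", "ransomware", "targets", "Microsoft", "Windows", "."]
tags = ["O", "B-Malware", "O", "O", "B-OS", "I-OS", "O"]
entities = bio_to_entities(tokens, tags)
print(entities)
```

In a pipeline like the one the abstract describes, the tag sequence would come from the trained sequence tagger rather than being written by hand; the grouping step shown here is independent of which model produced the tags.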

Original title: Deep Learning Approach for Intelligent Named Entity Recognition of Cyber Security

Original abstract: In recent years, the amount of Cyber Security data generated in the form of unstructured texts, for example, social media resources, blogs, articles, and so on has exceptionally increased. Named Entity Recognition (NER) is an initial step towards converting this unstructured data into structured data which can be used by a lot of applications. The existing methods on NER for Cyber Security data are based on rules and linguistic characteristics. A Deep Learning (DL) based approach embedded with Conditional Random Fields (CRFs) is proposed in this paper. Several DL architectures are evaluated to find the most optimal architecture. The combination of Bidirectional Gated Recurrent Unit (Bi-GRU), Convolutional Neural Network (CNN), and CRF performed better compared to various other DL frameworks on a publicly available benchmark dataset. This may be due to the reason that the bidirectional structures preserve the features related to the future and previous words in a sequence.

Original author: Vinayakumar R

Original URL: https://arxiv.org/abs/2004.00502