Deep Learning Approach for Intelligent Named Entity Recognition of Cyber Security (cs.CL)
- April 6, 2020
- Notes
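In recent years, the volume of cyber security data produced as unstructured text (social media posts, blogs, articles, and so on) has grown exceptionally fast. Named Entity Recognition (NER) is the first step in converting this unstructured data into structured data that many applications can consume. Existing NER methods for cyber security data rely on rules and linguistic features. This paper instead proposes a Deep Learning (DL) approach embedded with Conditional Random Fields (CRFs). Several DL architectures are evaluated to find the best one. The combination of a Bidirectional Gated Recurrent Unit (Bi-GRU), a Convolutional Neural Network (CNN), and a CRF outperformed the other DL architectures tested on a publicly available benchmark dataset. The authors suggest this may be because the bidirectional structure preserves features related to both the preceding and the following words in a sequence.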
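To make the role of the CRF layer concrete, here is a minimal sketch of Viterbi decoding for a linear-chain CRF in plain Python. It is not the paper's implementation. The tag set (`O`, `B-MAL`, `I-MAL`), the tokens, and every score are made-up assumptions for illustration. In the architecture described above, the per-token emission scores would come from the Bi-GRU and CNN encoder, and the transition scores would be learned by the CRF.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> Tuple[List[str], float]:
    """Return the highest-scoring tag sequence and its total score."""
    if not emissions:
        return [], 0.0
    # best[t][tag]: best score of any tag path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []  # back[t-1][tag]: best previous tag for `tag` at t
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Missing transition entries are treated as a neutral score of 0.0.
            prev = max(tags, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    last = max(tags, key=lambda tag: best[-1][tag])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical example: all names and numbers below are illustrative only.
tags = ["O", "B-MAL", "I-MAL"]
tokens = ["the", "Zeus", "trojan", "spread"]
emissions = [
    {"O": 2.0, "B-MAL": 0.1, "I-MAL": 0.0},
    {"O": 0.5, "B-MAL": 1.0, "I-MAL": 1.2},
    {"O": 0.3, "B-MAL": 0.2, "I-MAL": 1.5},
    {"O": 2.0, "B-MAL": 0.0, "I-MAL": 0.1},
]
transitions = {
    ("O", "I-MAL"): -5.0,     # an entity cannot start with an I- tag
    ("B-MAL", "I-MAL"): 1.0,  # B- followed by I- is a natural continuation
    ("I-MAL", "I-MAL"): 0.5,
}

# Greedy per-token argmax ignores transitions and can produce an invalid sequence.
greedy = [max(tags, key=lambda tag: scores[tag]) for scores in emissions]
path, score = viterbi_decode(emissions, transitions, tags)

print(greedy)       # ['O', 'I-MAL', 'I-MAL', 'O']
print(path, score)  # ['O', 'B-MAL', 'I-MAL', 'O'] 7.5
```

In this toy case the greedy output starts an entity with `I-MAL`, which violates the BIO scheme, while the decoded path uses the transition scores to pick `B-MAL` for "Zeus". This illustrates one commonly cited reason for placing a CRF on top of a neural encoder: it scores the tag sequence as a whole rather than each token in isolation.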
Original title: Deep Learning Approach for Intelligent Named Entity Recognition of Cyber Security
Original abstract: In recent years, the amount of Cyber Security data generated in the form of unstructured texts, for example, social media resources, blogs, articles, and so on has exceptionally increased. Named Entity Recognition (NER) is an initial step towards converting this unstructured data into structured data which can be used by a lot of applications. The existing methods on NER for Cyber Security data are based on rules and linguistic characteristics. A Deep Learning (DL) based approach embedded with Conditional Random Fields (CRFs) is proposed in this paper. Several DL architectures are evaluated to find the most optimal architecture. The combination of Bidirectional Gated Recurrent Unit (Bi-GRU), Convolutional Neural Network (CNN), and CRF performed better compared to various other DL frameworks on a publicly available benchmark dataset. This may be due to the reason that the bidirectional structures preserve the features related to the future and previous words in a sequence.
Original author: Vinayakumar R
Original URL: https://arxiv.org/abs/2004.00502