NLP: a Keras collection of Chinese text classification algorithms, packaged for easy use (detailed tutorial)

  • March 12, 2020
  • Notes

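Chinese text classification with Keras: long-text classification, short-sentence classification, multi-label classification, and sentence-pair similarity. The project provides base classes for building the embedding layers (character, word, and sentence vectors) and the network layers (graph), along with FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, CapsuleNet (capsule network), Transformer encoder, Seq2seq, and SWEM.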
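To make the difference between single-label and multi-label classification concrete, here is a minimal sketch of how the targets and the decoding step usually differ. The label names and helper functions are hypothetical and are not taken from the project. In common practice a single-label model ends in a softmax and is decoded with argmax, while a multi-label model ends in per-class sigmoids and is decoded with a threshold.

```python
# Hypothetical label set; the real labels depend on your dataset.
LABELS = ["sports", "finance", "tech"]
LABEL2IDX = {label: i for i, label in enumerate(LABELS)}


def one_hot(label):
    """Single-label target: exactly one position is 1."""
    vec = [0] * len(LABELS)
    vec[LABEL2IDX[label]] = 1
    return vec


def multi_hot(labels):
    """Multi-label target: every applicable position is 1."""
    vec = [0] * len(LABELS)
    for label in labels:
        vec[LABEL2IDX[label]] = 1
    return vec


def decode_single(probs):
    """Single-label decoding: pick the highest-scoring class (argmax)."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return LABELS[best]


def decode_multi(probs, threshold=0.5):
    """Multi-label decoding: keep every class whose score reaches the threshold."""
    return [LABELS[i] for i, p in enumerate(probs) if p >= threshold]


if __name__ == "__main__":
    print(one_hot("finance"))
    print(multi_hot(["sports", "tech"]))
    print(decode_single([0.1, 0.7, 0.2]))
    print(decode_multi([0.8, 0.3, 0.6]))
```

The 0.5 threshold is only a default assumption; in practice it is often tuned per dataset.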

01

keras_textclassification

02

Project description

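  1. Base classes are provided for the network (graph) and for the embeddings (word, character, and sentence embeddings); the concrete models inherit from them, which keeps the code simple.
  2. keras_layers holds commonly used layers, conf holds the paths to project data and models, data holds datasets and corpora, and data_preprocess is the data preprocessing module.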
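The base-class design described above can be sketched roughly as follows. This is a pure-Python illustration of the inheritance pattern, not the project's actual code: the class names, method names, and the use of plain strings in place of real Keras layers are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class BaseEmbedding:
    """Hypothetical embedding base: maps text to padded index sequences."""

    PAD, UNK = 0, 1

    def __init__(self, max_len=4):
        self.max_len = max_len
        self.token2idx = {"[PAD]": self.PAD, "[UNK]": self.UNK}

    def tokenize(self, text):
        # Character-level split, a common choice for Chinese text.
        return list(text)

    def fit_vocab(self, texts):
        # Assign the next free index to every token not seen before.
        for text in texts:
            for token in self.tokenize(text):
                self.token2idx.setdefault(token, len(self.token2idx))

    def sentence2idx(self, text):
        # Unknown tokens map to UNK; output is truncated or padded to max_len.
        ids = [self.token2idx.get(t, self.UNK) for t in self.tokenize(text)]
        ids = ids[: self.max_len]
        return ids + [self.PAD] * (self.max_len - len(ids))


class BaseGraph(ABC):
    """Hypothetical network base: shared setup lives here."""

    def __init__(self, embedding, num_labels):
        self.embedding = embedding
        self.num_labels = num_labels
        self.layers = self.create_model()

    @abstractmethod
    def create_model(self):
        """Return the layer stack; each concrete model overrides this."""

    def summary(self):
        stack = ["Embedding"] + self.layers + [f"Dense({self.num_labels})"]
        return " -> ".join(stack)


class FastTextGraph(BaseGraph):
    def create_model(self):
        # Strings stand in for real Keras layers in this sketch.
        return ["GlobalAveragePooling1D"]


class TextCNNGraph(BaseGraph):
    def create_model(self):
        return ["Conv1D(k=3,4,5)", "GlobalMaxPooling1D", "Concatenate"]


if __name__ == "__main__":
    emb = BaseEmbedding(max_len=4)
    emb.fit_vocab(["我爱自然语言", "文本分类"])
    print(emb.sentence2idx("文本"))
    print(FastTextGraph(emb, num_labels=3).summary())
    print(TextCNNGraph(emb, num_labels=3).summary())
```

With this layout a new model only has to override `create_model`; vocabulary handling, padding, and the output layer stay in the base classes.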

03

Models, paper titles, and links

  • FastText: Bag of Tricks for Efficient Text Classification
  • TextCNN: Convolutional Neural Networks for Sentence Classification
  • charCNN-kim: Character-Aware Neural Language Models
  • charCNN-zhang: Character-level Convolutional Networks for Text Classification
  • TextRNN: Recurrent Neural Network for Text Classification with Multi-Task Learning
  • RCNN: Recurrent Convolutional Neural Networks for Text Classification
  • DCNN: A Convolutional Neural Network for Modelling Sentences
  • DPCNN: Deep Pyramid Convolutional Neural Networks for Text Categorization
  • VDCNN: Very Deep Convolutional Networks for Text Classification
  • CRNN: A C-LSTM Neural Network for Text Classification
  • DeepMoji: Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm
  • SelfAttention: Attention Is All You Need
  • HAN: Hierarchical Attention Networks for Document Classification
  • CapsuleNet: Dynamic Routing Between Capsules
  • Transformer(encode or decode): Attention Is All You Need
  • Bert: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  • Xlnet: XLNet: Generalized Autoregressive Pretraining for Language Understanding
  • Albert: ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

04

References / Acknowledgments

  • Text classification project: https://github.com/mosu027/TextClassification
  • Text classification (Zhihu Kanshan Cup): https://github.com/brightmart/text_classification
  • Kashgari project: https://github.com/BrikerMan/Kashgari
  • Text classification by lpty: https://github.com/lpty/classifier
  • Keras text classification: https://github.com/ShawnyXiao/TextClassification-Keras
  • Keras text classification: https://github.com/AlexYangLi/TextClassification
  • CapsuleNet model: https://github.com/bojone/Capsule
  • Transformer model: https://github.com/CyberZHG/keras-transformer
  • keras_albert_model: https://github.com/TinkerMob/keras_albert_model

05

Simple training call:

06

Train & Usage

07

Predict & Usage