Finnish Language Modeling with Deep Transformer Models (cs.SD)

  • March 27, 2020
  • Notes

Transformers have recently taken center stage in language modeling, after LSTMs were long considered the dominant model architecture. In this project, we investigate the performance of the BERT and Transformer-XL architectures on the language modeling task. Using a sub-word model setting for Finnish, we compare them to the previous state-of-the-art (SOTA) LSTM model. BERT achieves a pseudo-perplexity score of 14.5, which to our knowledge is the first such measurement. Transformer-XL improves the perplexity score to 73.58, a 27% improvement over the LSTM model.
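The pseudo-perplexity reported for BERT is usually computed by masking each token of a sentence in turn, scoring the masked token with the model, and exponentiating the negative mean of the resulting log-probabilities. A minimal sketch of that final aggregation step, where the per-token log-probabilities are hypothetical stand-ins for actual masked-LM outputs:

```python
import math

def pseudo_perplexity(token_log_probs):
    """Pseudo-perplexity: exp of the negative mean per-token
    log-probability. Each entry is assumed to come from masking
    one token and scoring it with a masked LM such as BERT."""
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n)

# Hypothetical log-probabilities for a 4-token sentence.
scores = [-1.2, -0.8, -2.0, -1.5]
print(pseudo_perplexity(scores))
```

Lower scores indicate the model assigns higher probability to each held-out token in context; unlike ordinary perplexity, this measure conditions on both left and right context, so the two numbers are not directly comparable.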

Original title: Finnish Language Modeling with Deep Transformer Models

Original abstract: Transformers have recently taken the center stage in language modeling after LSTM's were considered the dominant model architecture for a long time. In this project, we investigate the performance of the Transformer architectures-BERT and Transformer-XL for the language modeling task. We use a sub-word model setting with the Finnish language and compare it to the previous State of the art (SOTA) LSTM model. BERT achieves a pseudo-perplexity score of 14.5, which is the first such measure achieved as far as we know. Transformer-XL improves upon the perplexity score to 73.58 which is 27% better than the LSTM model.

Author: Abhilash Jain

Link: https://arxiv.org/abs/2003.11562