Paper Notes: Downstream Model Design of Pre-trained Language Model for Relation Extraction Task

Background

  • Preliminaries

    Overlapping relations

    • Normal: no two relations in the sample share any entity.
    • EPO (EntityPairOverlap): at least two relations in the sample share the same entity pair.
    • SEO (SingleEntityOverlap): at least two relations in the sample share exactly one entity.

    Multiple relations

    • Single: exactly one relation appears in the sample.
    • Double: exactly two relations appear in the sample.
    • Multiple: three or more relations appear in the sample.
  • Motivation

    1. Current PLMs have no design tailored to the relation extraction task.
    2. Existing methods do not handle the three major problems well: long-distance relations, single sentences containing multiple relations, and overlapping relations.
  • Contributions

    1. Replace the traditional encoder with a pre-trained language model.
    2. Use a parameterized asymmetric kernel inner product matrix over the head and tail embeddings of each token in the sequence; this matrix can be read as a tendency score indicating a certain relation.
    3. Replace the Softmax classifier with a Sigmoid classifier and take the average probability as the final probability, which makes it possible to predict multiple relations between the same entity pair.
    4. Two innovations overall: the network structure and the loss function.

Model

[Figure: overall model architecture]

Encoder: produces three kinds of embeddings for the given text. (==After fine-tuning, words with higher attention scores correspond, to some extent, to the predicates of certain relations.==)

Tail embedding: E_{p}=Transformer(E_{w})

E_{p} is the final output vector; E_{w} is the output vector of BERT's second-to-last layer.

Head embedding: E_{b}=E_{w}+E_{a}

E_{a}: BERT's CLS embedding, added to capture the overall context information.
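A minimal PyTorch sketch of this encoder step, assuming `bert-base-uncased` from Hugging Face; the single extra `nn.TransformerEncoderLayer` producing E_{p} and the layer supplying the CLS vector are illustrative assumptions, not the paper's verified configuration:

```python
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

# Extra Transformer layer mapping E_w to the tail embedding E_p (assumed: one layer)
tail_layer = nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)

enc = tokenizer("John was born in New York .", return_tensors="pt")
hidden = bert(**enc).hidden_states    # tuple: input embeddings + 12 layer outputs

E_w = hidden[-2]                      # second-to-last layer output, shape (1, l, 768)
E_a = E_w[:, 0:1, :]                  # CLS vector (assumption: taken from the same layer)
E_b = E_w + E_a                       # head embedding E_b = E_w + E_a, broadcast over tokens
E_p = tail_layer(E_w)                 # tail embedding E_p = Transformer(E_w)
```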

Relation Computing Layer

Compute the similarity between E_{b} and E_{p}:

\boldsymbol{S}_{i}=F_{i}\left(\boldsymbol{E}_{b}, \boldsymbol{E}_{p}\right)

where F_{i}(\boldsymbol{X}, \boldsymbol{Y})=\boldsymbol{X} \boldsymbol{W}_{h i} \cdot\left(\boldsymbol{Y} \boldsymbol{W}_{t i}\right)^{T}

W_{hi} and W_{ti} are the transformation matrices of the head and tail entities under the i-th relation, respectively.

S_{i} is a square matrix that can be viewed as the unnormalized probability scores between all token pairs for the i-th relation; that is, ==S_{i,mn} indicates how likely the tokens at positions (m, n) are to hold relation i==.

Now normalize it, elementwise, with the sigmoid function (the Sigmoid classifier mentioned in the contributions):

P_{i}=\operatorname{sigmoid}\left(\boldsymbol{S}_{i}\right)
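A sketch of the relation computing layer under the same assumptions, with one learned pair (W_{hi}, W_{ti}) per relation; the kernel size and the Gaussian initialization are assumptions:

```python
import torch
import torch.nn as nn

class RelationComputingLayer(nn.Module):
    """Parameterized asymmetric kernel inner product, one kernel per relation."""
    def __init__(self, hidden_size: int, num_relations: int, kernel_size: int = 64):
        super().__init__()
        # W_hi / W_ti: per-relation transformations for head / tail embeddings
        self.W_h = nn.Parameter(torch.randn(num_relations, hidden_size, kernel_size) * 0.02)
        self.W_t = nn.Parameter(torch.randn(num_relations, hidden_size, kernel_size) * 0.02)

    def forward(self, E_b, E_p):
        # E_b, E_p: (batch, l, hidden) -> P: (batch, num_relations, l, l)
        H = torch.einsum("bld,rdk->brlk", E_b, self.W_h)   # E_b @ W_hi
        T = torch.einsum("bld,rdk->brlk", E_p, self.W_t)   # E_p @ W_ti
        S = torch.einsum("brmk,brnk->brmn", H, T)          # S_i = (E_b W_hi)(E_p W_ti)^T
        return torch.sigmoid(S)                            # P_i = sigmoid(S_i), elementwise
```

Each relation gets its own l × l probability matrix, which is what allows multiple relations to be predicted for the same token pair.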

Loss Calculation

NOTE: the P_{i} above describes relations between tokens, not entities. An entity-mask matrix is used to bridge this gap.

Let \mathbb{S}=\{(x, y)\} be the set of all entity pairs in the text T (all ordered combinations of its entities).

We construct a mask matrix \boldsymbol{M} \in \mathbb{R}^{l \times l}, where l is the text length and (B_{x}, E_{x}) are the start and end position indices of entity x. ==This mask matrix is used to retain, from \boldsymbol{P_{i}}, the predicted probabilities of every entity pair.==

M_{m n}=\begin{cases}1, & \exists(x, y) \in \mathbb{S}:\ B_{x} \leq m \leq E_{x} \text{ and } B_{y} \leq n \leq E_{y} \\ 0, & \text{otherwise}\end{cases}

where m, n are the indices of the matrix elements. We also construct a label matrix \boldsymbol{Y_{i}} (the ground truth), \boldsymbol{Y_{i}} \in \mathbb{R}^{l \times l}:

Y_{i, m n}=\begin{cases}1, & \exists(x, y) \in \mathbb{S} \text{ holding relation } i:\ B_{x} \leq m \leq E_{x} \text{ and } B_{y} \leq n \leq E_{y} \\ 0, & \text{otherwise}\end{cases}

\boldsymbol{Y_{i}} marks the entity pairs labeled with the i-th relation in the input text T.
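A sketch of how \boldsymbol{M} and \boldsymbol{Y_{i}} can be built from entity spans, following the inclusive (B_{x}, E_{x}) index convention above; `build_mask` and `build_label` are hypothetical helpers:

```python
import torch

def build_mask(entity_spans, l):
    """M[m, n] = 1 iff (m, n) lies in the span product of some entity pair (x, y)."""
    M = torch.zeros(l, l)
    for (bx, ex), (by, ey) in entity_spans:          # all ordered entity pairs (x, y)
        M[bx:ex + 1, by:ey + 1] = 1.0                # inclusive spans B_x..E_x, B_y..E_y
    return M

def build_label(relation_pairs, l):
    """Y_i[m, n] = 1 iff the token pair (m, n) belongs to a pair holding relation i."""
    Y = torch.zeros(l, l)
    for (bx, ex), (by, ey) in relation_pairs:        # pairs labeled with the i-th relation
        Y[bx:ex + 1, by:ey + 1] = 1.0
    return Y

# e.g. entities "John" (tokens 0-0) and "New York" (tokens 4-5), both orderings:
M = build_mask([((0, 0), (4, 5)), ((4, 5), (0, 0))], l=7)
```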

Then apply the average binary cross-entropy (BCE), where * denotes elementwise multiplication:

L_{i}=BCE_{avg}\left(\boldsymbol{P}_{i} * \boldsymbol{M}, \boldsymbol{Y}_{i}\right)

BCE_{avg}(\boldsymbol{P}, \boldsymbol{Y})=-\frac{1}{l^{2}} \sum_{m=1}^{l} \sum_{n=1}^{l}\left[Y_{m n} \log P_{m n}+\left(1-Y_{m n}\right) \log \left(1-P_{m n}\right)\right]

LOSS, summed over the relation index i:

L=\sum_{i} L_{i}
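Putting the pieces together, a sketch of the loss; `relation_loss` is a hypothetical name, and note that cells zeroed out by the mask contribute nothing to the loss (P = 0 and Y = 0 there), so the plain elementwise mean matches the BCE_{avg} reconstruction above:

```python
import torch
import torch.nn.functional as F

def relation_loss(P, M, Y):
    """
    P: (num_relations, l, l)  predicted probability matrices P_i
    M: (l, l)                 entity-pair mask matrix
    Y: (num_relations, l, l)  ground-truth label matrices Y_i
    Returns the total loss L = sum_i BCE_avg(P_i * M, Y_i).
    """
    # reduction="mean" averages over all l*l cells; masked-out cells give BCE(0, 0) = 0
    losses = [F.binary_cross_entropy(P_i * M, Y_i) for P_i, Y_i in zip(P, Y)]
    return torch.stack(losses).sum()
```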


Experiment

[Figures: experimental result tables]