Hurtful Words: Quantifying Biases in Clinical Contextual Word Embeddings (cs.CL)
- March 26, 2020
- Notes
In this work, we examine the extent to which embeddings may encode marginalized populations differently, and how this may lead to a perpetuation of biases and worsened performance on clinical tasks. We pretrain deep embedding models (BERT) on medical notes from the MIMIC-III hospital dataset, and quantify potential disparities using two approaches. First, we identify dangerous latent relationships that are captured by the contextual word embeddings using a fill-in-the-blank method with text from real clinical notes and a log probability bias score quantification. Second, we evaluate performance gaps across different definitions of fairness on over 50 downstream clinical prediction tasks that include detection of acute and chronic conditions. We find that classifiers trained from BERT representations exhibit statistically significant differences in performance, often favoring the majority group with regards to gender, language, ethnicity, and insurance status. Finally, we explore shortcomings of using adversarial debiasing to obfuscate subgroup information in contextual word embeddings, and recommend best practices for such deep embedding models in clinical settings.
Original title: Hurtful Words: Quantifying Biases in Clinical Contextual Word Embeddings
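The abstract's first method scores fill-in-the-blank completions with a log probability bias score. A minimal sketch of how such a score could be computed, assuming the masked-LM probabilities for the target word (with and without the subgroup attribute in the template) have already been obtained; the function names and the exact formula here are illustrative, not taken from the paper:

```python
import math

def log_probability_bias_score(p_target_given_attr, p_prior_target):
    """How much more likely the target word becomes once a subgroup
    attribute (e.g., a gender indicator) appears in the template,
    relative to its prior probability in a neutral template.
    Inputs are probabilities from a masked language model (hypothetical)."""
    return math.log(p_target_given_attr / p_prior_target)

def bias_between_groups(p_target_a, p_prior_a, p_target_b, p_prior_b):
    """Difference of the per-group scores; positive values mean the
    model associates the target word more strongly with group A."""
    return (log_probability_bias_score(p_target_a, p_prior_a)
            - log_probability_bias_score(p_target_b, p_prior_b))
```

With these (made-up) numbers, a target twice as likely as its prior for group A and half as likely for group B yields a positive gap: `bias_between_groups(0.2, 0.1, 0.05, 0.1)` equals `log(4)`.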
Original author: Amy X. Lu
Original link: https://arxiv.org/abs/2003.11515
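The abstract's second method measures performance gaps between subgroups across fairness definitions. A minimal sketch of one such gap, the difference in recall (true positive rate) between subgroups, in the spirit of equality of opportunity; this is an illustrative metric, not necessarily the paper's exact formulation:

```python
from collections import defaultdict

def recall_by_group(y_true, y_pred, groups):
    """Per-subgroup recall (true positive rate) for a binary task,
    e.g., detection of a chronic condition split by gender or insurance."""
    tp = defaultdict(int)   # true positives per subgroup
    pos = defaultdict(int)  # actual positives per subgroup
    for yt, yp, g in zip(y_true, y_pred, groups):
        if yt == 1:
            pos[g] += 1
            if yp == 1:
                tp[g] += 1
    return {g: tp[g] / pos[g] for g in pos}

def recall_gap(y_true, y_pred, groups):
    """Spread between the best- and worst-served subgroups; 0 means
    the classifier's recall is identical across groups."""
    recalls = recall_by_group(y_true, y_pred, groups)
    return max(recalls.values()) - min(recalls.values())
```

For example, a classifier that catches all positives in one group but only half in another has a gap of 0.5; in practice such gaps would also need a significance test, as the paper reports statistically significant differences.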