Improving the Utility of Knowledge Graph Embeddings with Calibration (cs.CL)
- April 6, 2020
- Notes
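This paper addresses machine learning models that embed the entities and relations of a knowledge graph in order to predict unseen triples, an important task because most knowledge graphs are inherently incomplete. The authors argue that although offline link prediction accuracy with embeddings has improved steadily on benchmark datasets, such models have limited practical utility for real-world knowledge graph completion, because it is unclear when their predictions should be accepted or trusted. To address this, they propose calibrating knowledge graph embedding models so that they output reliable confidence estimates for predicted triples. In crowdsourcing experiments, they show that calibrated confidence scores make knowledge graph embeddings more useful to practitioners and data annotators working on knowledge graph completion. They also release two resources from the evaluation tasks: an enriched version of the FB15K benchmark and a new knowledge graph dataset extracted from Wikidata.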
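The summary does not say which calibration method is applied, so the following is only a minimal sketch under stated assumptions: it uses Platt scaling (a logistic fit on raw scores), a common post-hoc calibration technique chosen here purely for illustration. The synthetic scores, the helper names (`fit_platt`, `expected_calibration_error`, `make_synthetic`), and the ten-bin calibration error are hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def fit_platt(scores, labels, lr=0.1, epochs=1000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # gradient of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def expected_calibration_error(probs, labels, n_bins=10):
    """Size-weighted mean gap between the mean predicted probability and the
    observed fraction of true triples, over equal-width probability bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    ece = 0.0
    for bucket in bins:
        if bucket:
            conf = sum(p for p, _ in bucket) / len(bucket)
            frac_true = sum(y for _, y in bucket) / len(bucket)
            ece += len(bucket) / len(probs) * abs(frac_true - conf)
    return ece


def make_synthetic(n=2000, seed=0):
    """Hypothetical triple scores whose true correctness probability is
    sigmoid(0.5 * score - 1), so sigmoid(score) alone is overconfident."""
    rng = random.Random(seed)
    scores, labels = [], []
    for _ in range(n):
        s = rng.uniform(-6.0, 6.0)
        scores.append(s)
        labels.append(1 if rng.random() < sigmoid(0.5 * s - 1.0) else 0)
    return scores, labels


scores, labels = make_synthetic()
# Fit on a validation split, evaluate on a disjoint held-out split.
val_s, val_y = scores[:1000], labels[:1000]
test_s, test_y = scores[1000:], labels[1000:]

a, b = fit_platt(val_s, val_y)
raw_probs = [sigmoid(s) for s in test_s]          # uncalibrated confidences
cal_probs = [sigmoid(a * s + b) for s in test_s]  # Platt-scaled confidences

ece_raw = expected_calibration_error(raw_probs, test_y)
ece_cal = expected_calibration_error(cal_probs, test_y)
print(f"a={a:.3f} b={b:.3f} ECE raw={ece_raw:.3f} ECE calibrated={ece_cal:.3f}")
```

In this toy setting, a lower calibration error after scaling means a confidence of, say, 0.8 corresponds more closely to roughly 80% of such predicted triples being correct, which is the property that would let a practitioner or annotator decide when to accept a prediction.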
Original title: Improving the Utility of Knowledge Graph Embeddings with Calibration
Original abstract: This paper addresses machine learning models that embed knowledge graph entities and relationships toward the goal of predicting unseen triples, which is an important task because most knowledge graphs are by nature incomplete. We posit that while offline link prediction accuracy using embeddings has been steadily improving on benchmark datasets, such embedding models have limited practical utility in real-world knowledge graph completion tasks because it is not clear when their predictions should be accepted or trusted. To this end, we propose to calibrate knowledge graph embedding models to output reliable confidence estimates for predicted triples. In crowdsourcing experiments, we demonstrate that calibrated confidence scores can make knowledge graph embeddings more useful to practitioners and data annotators in knowledge graph completion tasks. We also release two resources from our evaluation tasks: An enriched version of the FB15K benchmark and a new knowledge graph dataset extracted from Wikidata.
Original author: Tara Safavi
Original URL: https://arxiv.org/abs/2004.01168