Improving the Utility of Knowledge Graph Embeddings with Calibration (cs.CL)

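This paper addresses machine learning models that embed the entities and relations of a knowledge graph in order to predict unseen triples, an important task because most knowledge graphs are inherently incomplete. We argue that although offline link prediction accuracy with embeddings has steadily improved on benchmark datasets, such embedding models have limited practical utility in real-world knowledge graph completion, because it is unclear when their predictions should be accepted or trusted. To address this, we propose calibrating knowledge graph embedding models so that they output reliable confidence estimates for predicted triples. In crowdsourcing experiments, we show that calibrated confidence scores make knowledge graph embeddings more useful to practitioners and data annotators working on knowledge graph completion. We also release two resources from our evaluation tasks: an enriched version of the FB15K benchmark and a new knowledge graph dataset extracted from Wikidata.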

Original title: Improving the Utility of Knowledge Graph Embeddings with Calibration

Original abstract: This paper addresses machine learning models that embed knowledge graph entities and relationships toward the goal of predicting unseen triples, which is an important task because most knowledge graphs are by nature incomplete. We posit that while offline link prediction accuracy using embeddings has been steadily improving on benchmark datasets, such embedding models have limited practical utility in real-world knowledge graph completion tasks because it is not clear when their predictions should be accepted or trusted. To this end, we propose to calibrate knowledge graph embedding models to output reliable confidence estimates for predicted triples. In crowdsourcing experiments, we demonstrate that calibrated confidence scores can make knowledge graph embeddings more useful to practitioners and data annotators in knowledge graph completion tasks. We also release two resources from our evaluation tasks: An enriched version of the FB15K benchmark and a new knowledge graph dataset extracted from Wikidata.

Original author: Tara Safavi

Original URL: https://arxiv.org/abs/2004.01168