Abstract
Knowledge Graph Completion (KGC) has emerged as a promising solution to the incompleteness of Knowledge Graphs (KGs). Traditional KGC research centers primarily on triple classification and link prediction. We contend, however, that these tasks do not align well with real-world scenarios and serve merely as surrogate benchmarks. To move toward more realistic challenges, this paper investigates three processes that are crucial in real-world KG construction: (a) the verification process, which arises from both the necessity and the limitations of human verifiers; (b) the mining process, which identifies the most promising candidates for verification; and (c) the training process, which harnesses the verified data for subsequent use. By integrating these three processes, we introduce the Progressive Knowledge Graph Completion (PKGC) task, which simulates the gradual completion of KGs in real-world scenarios. Furthermore, to expedite PKGC processing, we propose two acceleration modules: the Optimized Top-$k$ algorithm and the Semantic Validity Filter. These modules significantly enhance the efficiency of the mining procedure. Our experiments demonstrate that performance in link prediction does not accurately reflect performance in PKGC. A more in-depth analysis reveals the key factors influencing the results and suggests potential directions for future research.
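As a rough illustration of how the three processes could fit together, the following toy sketch runs a mine, verify, train loop. It is not the paper's implementation: the names `pkgc_loop`, `retrain`, `budget`, and `hidden_true` are hypothetical, the human verifier is simulated by a held-out set of true triples, the scorer is a trivial heuristic rather than a KGC model, and the exhaustive ranking stands in for (and does not reproduce) the Optimized Top-$k$ algorithm or the Semantic Validity Filter.

```python
import heapq
from itertools import product


def retrain(known):
    """Hypothetical 'training' step: score a triple 1.0 if its
    (relation, tail) pair already occurs in the known facts, else 0.0."""
    seen = {(r, t) for _, r, t in known}
    return lambda triple: 1.0 if (triple[1], triple[2]) in seen else 0.0


def pkgc_loop(entities, relations, known, hidden_true, rounds=3, budget=2):
    """Toy progressive completion: mine -> verify -> train, repeated."""
    known = set(known)
    rejected = set()
    score = retrain(known)
    for _ in range(rounds):
        # Mining: rank every candidate not yet judged, keep the top `budget`.
        candidates = (
            c for c in product(entities, relations, entities)
            if c not in known and c not in rejected
        )
        proposals = heapq.nlargest(budget, candidates, key=score)
        # Verification: the simulated verifier accepts only held-out true facts,
        # and can check at most `budget` proposals per round.
        accepted = [c for c in proposals if c in hidden_true]
        rejected.update(c for c in proposals if c not in hidden_true)
        # Training: fold the verified facts back in and refresh the scorer.
        known.update(accepted)
        score = retrain(known)
    return known


completed = pkgc_loop(
    entities=["a", "b", "c"],
    relations=["r"],
    known={("a", "r", "c")},
    hidden_true={("b", "r", "c")},
)
```

In this toy run the first round proposes the two unseen triples whose tail is `c`; the simulated verifier accepts `("b", "r", "c")` and rejects the other, and later rounds find nothing further to add. The `budget` parameter reflects the limited capacity of human verifiers that motivates the task.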
URL
https://arxiv.org/abs/2404.09897