Abstract
Answering complex queries on incomplete knowledge graphs is a challenging task: a model needs to answer complex logical queries in the presence of missing knowledge. Recently, Arakelyan et al. (2021) and Minervini et al. (2022) showed that neural link predictors can also be used for answering complex queries: their Continuous Query Decomposition (CQD) method works by decomposing complex queries into atomic sub-queries, answering them using neural link predictors, and aggregating their scores via t-norms to rank the answers to each complex query. However, CQD does not handle negations and only uses the training signal from atomic training queries: neural link prediction scores are not calibrated to interact together via fuzzy logic t-norms during complex query answering. In this work, we propose to address this problem by training a parameter-efficient score adaptation model that re-calibrates neural link prediction scores: this new component is trained on complex queries by back-propagating through the complex query-answering process. Our method, CQD$^{A}$, produces significantly more accurate results than current state-of-the-art methods, improving the Mean Reciprocal Rank averaged across all datasets and query types from $34.4$ to $35.1$, while using $\leq 35\%$ of the available training query types. We further show that CQD$^{A}$ is data-efficient, achieving competitive results with only $1\%$ of the training data, and robust in out-of-domain evaluations.
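To make the aggregation step concrete, the following is a minimal sketch of how atomic link-prediction scores can be combined via a product t-norm for a conjunctive query with an existentially quantified variable. All names and score values here are illustrative assumptions, not the paper's actual implementation; CQD additionally supports other t-norms and top-k optimisation strategies not shown.

```python
import numpy as np

# Hypothetical pre-computed neural link prediction scores in [0, 1] for the
# two atoms of a conjunctive query q(A) :- p1(anchor, V) AND p2(V, A),
# over 4 candidate intermediate entities V and 3 candidate answers A.
scores_atom1 = np.array([0.9, 0.2, 0.7, 0.4])       # score of each V
scores_atom2 = np.array([[0.8, 0.1, 0.5],
                         [0.3, 0.9, 0.2],
                         [0.6, 0.4, 0.9],
                         [0.1, 0.2, 0.3]])          # score of each (V, A) pair

# Product t-norm for the conjunction: T(x, y) = x * y; the existential
# quantification over V is handled by taking the max over V.
conjunction = scores_atom1[:, None] * scores_atom2  # shape (4, 3)
answer_scores = conjunction.max(axis=0)             # one score per answer A

ranking = np.argsort(-answer_scores)                # rank candidate answers
```

Since every operation here is differentiable (up to the max), gradients can flow from a loss on `answer_scores` back into the atomic scores, which is what allows a score adaptation component to be trained end-to-end on complex queries.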
URL
https://arxiv.org/abs/2301.12313