Abstract
We consider the problem of finding plausible knowledge that is missing from a given ontology, as a generalisation of the well-studied taxonomy expansion task. One line of work treats this task as a Natural Language Inference (NLI) problem, thus relying on the knowledge captured by language models to identify the missing knowledge. Another line of work uses concept embeddings to identify what different concepts have in common, taking inspiration from cognitive models for category-based induction. These two approaches are intuitively complementary, but their effectiveness has not yet been compared. In this paper, we introduce a benchmark for evaluating ontology completion methods and thoroughly analyse the strengths and weaknesses of both approaches. We find that both approaches are indeed complementary, with hybrid strategies achieving the best overall results. We also find that the task is highly challenging for Large Language Models, even after fine-tuning.
URL
https://arxiv.org/abs/2403.17216