Paper Reading AI Learner

Large language models as oracles for instantiating ontologies with domain-specific knowledge

2024-04-05 14:04:07
Giovanni Ciatto, Andrea Agiollo, Matteo Magnini, Andrea Omicini

Abstract

Background. Endowing intelligent systems with semantic data commonly requires designing and instantiating ontologies with domain-specific knowledge. Especially in the early phases, those activities are typically performed manually by human experts, possibly drawing on their own experience. The resulting process is therefore time-consuming, error-prone, and often biased by the personal background of the ontology designer. Objective. To mitigate that issue, we propose a novel domain-independent approach to automatically instantiate ontologies with domain-specific knowledge, by leveraging large language models (LLMs) as oracles. Method. Starting from (i) an initial schema composed of inter-related classes and properties and (ii) a set of query templates, our method queries the LLM multiple times, and generates instances for both classes and properties from its replies. Thus, the ontology is automatically filled with domain-specific knowledge, compliant with the initial schema. As a result, the ontology is quickly and automatically enriched with manifold instances, which experts may choose to keep, adjust, discard, or complement according to their own needs and expertise. Contribution. We formalise our method in a general way and instantiate it over various LLMs, as well as on a concrete case study. We report experiments rooted in the nutritional domain where an ontology of food meals and their ingredients is semi-automatically instantiated from scratch, starting from a categorisation of meals and their relationships. There, we analyse the quality of the generated ontologies and compare ontologies attained by exploiting different LLMs. Finally, we provide a SWOT analysis of the proposed method.
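
The Method step of the abstract (schema plus query templates, repeated queries to an oracle, instances built from the replies) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the two template strings, the `Ontology` container, the `parse_reply` and `populate` helpers, and the `stub_oracle` function, which returns canned text in place of a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical query templates; the abstract does not give the real ones.
CLASS_TEMPLATE = "List examples of {cls}. Answer with one name per line."
PROPERTY_TEMPLATE = "List the {prop} of {individual}. Answer with one name per line."


@dataclass
class Ontology:
    """Toy ontology: a fixed schema plus the instances generated for it."""
    classes: Set[str]
    # property name -> (domain class, range class)
    properties: Dict[str, Tuple[str, str]]
    # class name -> individuals of that class
    individuals: Dict[str, Set[str]] = field(default_factory=dict)
    # (subject, property, object) triples
    facts: Set[Tuple[str, str, str]] = field(default_factory=set)


def parse_reply(reply: str) -> List[str]:
    """Turn a free-text reply into item names, one per non-empty line."""
    items = []
    for line in reply.splitlines():
        name = line.strip().lstrip("-* ").strip()  # drop simple bullet markers
        if name:
            items.append(name)
    return items


def populate(onto: Ontology, oracle: Callable[[str], str]) -> Ontology:
    """Query the oracle once per class, then once per (property, individual)."""
    for cls in sorted(onto.classes):
        reply = oracle(CLASS_TEMPLATE.format(cls=cls))
        onto.individuals.setdefault(cls, set()).update(parse_reply(reply))
    for prop, (domain, range_) in sorted(onto.properties.items()):
        # sorted() copies the set, so adding individuals below is safe
        for individual in sorted(onto.individuals.get(domain, set())):
            reply = oracle(PROPERTY_TEMPLATE.format(prop=prop, individual=individual))
            for value in parse_reply(reply):
                # keep the result compliant with the schema: values join the range class
                onto.individuals.setdefault(range_, set()).add(value)
                onto.facts.add((individual, prop, value))
    return onto


def stub_oracle(prompt: str) -> str:
    """Stand-in for an LLM: canned replies for known prompts, empty otherwise."""
    canned = {
        CLASS_TEMPLATE.format(cls="Meal"): "- Carbonara\n- Caprese",
        PROPERTY_TEMPLATE.format(prop="hasIngredient", individual="Carbonara"): "Egg\nGuanciale",
        PROPERTY_TEMPLATE.format(prop="hasIngredient", individual="Caprese"): "Tomato\nMozzarella",
    }
    return canned.get(prompt, "")


onto = populate(
    Ontology(
        classes={"Meal", "Ingredient"},
        properties={"hasIngredient": ("Meal", "Ingredient")},
    ),
    stub_oracle,
)
```

Passing the oracle as a plain callable keeps the sketch independent of any particular LLM client; replacing `stub_oracle` with a function that calls a real model would leave `populate` unchanged. The expert review step described in the abstract (keep, adjust, discard, complement) would then operate on `onto.individuals` and `onto.facts`.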

URL

https://arxiv.org/abs/2404.04108

PDF

https://arxiv.org/pdf/2404.04108.pdf

