Despite the widespread application of knowledge graphs (KGs) in tasks such as question answering and intelligent conversational systems, existing KGs face two major challenges: information granularity and deficiency in timeliness. These considerably hinder the retrieval and analysis of in-context, fine-grained, and up-to-date knowledge from KGs, particularly for highly specialized themes (e.g., specialized scientific research) and rapidly evolving contexts (e.g., breaking news or disaster tracking). To tackle these challenges, we propose the theme-specific knowledge graph (ThemeKG), a KG constructed from a theme-specific corpus, and design an unsupervised framework for ThemeKG construction (named TKGCon). The framework takes a raw theme-specific corpus and generates a high-quality KG that includes salient entities and relations under the theme. Specifically, we start with an entity ontology of the theme from Wikipedia, based on which we generate candidate relations with Large Language Models (LLMs) to construct a relation ontology. To parse the documents from the theme corpus, we first map the extracted entity pairs to the ontology and retrieve the candidate relations. Finally, we incorporate the context and ontology to consolidate the relations for entity pairs. We observe that directly prompting GPT-4 for a theme-specific KG leads to inaccurate entities (such as "two main types" returned as a single entity) and unclear (such as "is", "has") or wrong relations (such as "have due to", "to start"). In contrast, by constructing the theme-specific KG step by step, our model outperforms GPT-4 and consistently identifies accurate entities and relations. Experimental results also show that our framework excels in evaluations against various KG construction baselines.
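A minimal sketch of the two-phase procedure described above: first, candidate relations between entity-category pairs are generated by an LLM to form the relation ontology; then, for each extracted entity pair, the candidate relations are retrieved via the ontology and consolidated against the sentence context. The helpers (llm_complete, the category lookup, the data structures) are illustrative assumptions, not TKGCon's released code.

```python
# Hypothetical sketch of the TKGCon two-phase pipeline; llm_complete is an
# injected function wrapping whatever LLM is used.

def build_relation_ontology(entity_categories, llm_complete):
    """Phase 1: ask an LLM for candidate relations between category pairs."""
    relation_ontology = {}
    for head_cat in entity_categories:
        for tail_cat in entity_categories:
            prompt = (f"List plausible relations between an entity of type "
                      f"'{head_cat}' and an entity of type '{tail_cat}'.")
            relation_ontology[(head_cat, tail_cat)] = llm_complete(prompt)
    return relation_ontology

def consolidate_relation(entity_pair, sentence, categories,
                         relation_ontology, llm_complete):
    """Phase 2: map an extracted pair to the ontology, then pick one relation."""
    head_cat = categories[entity_pair[0]]
    tail_cat = categories[entity_pair[1]]
    candidates = relation_ontology[(head_cat, tail_cat)]
    prompt = (f"Context: {sentence}\nEntities: {entity_pair}\n"
              f"Choose the single best relation from: {candidates}")
    return llm_complete(prompt)
```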
https://arxiv.org/abs/2404.19146
Ethical reasoning is a crucial skill for Large Language Models (LLMs). However, moral values are not universal, but rather influenced by language and culture. This paper explores how three prominent LLMs -- GPT-4, ChatGPT, and Llama2-70B-Chat -- perform ethical reasoning in different languages and whether their moral judgements depend on the language in which they are prompted. We extend the study of ethical reasoning of LLMs by Rao et al. (2023) to a multilingual setup, following their framework of probing LLMs with ethical dilemmas and policies from three branches of normative ethics: deontology, virtue, and consequentialism. We experiment with six languages: English, Spanish, Russian, Chinese, Hindi, and Swahili. We find that GPT-4 is the most consistent and unbiased ethical reasoner across languages, while ChatGPT and Llama2-70B-Chat show significant moral value bias when we move to languages other than English. Interestingly, the nature of this bias varies significantly across languages for all LLMs, including GPT-4.
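A minimal sketch of the multilingual probing loop, assuming an OpenAI-style chat API; the dilemma texts and the policy-number answer format are placeholders, not the actual items from Rao et al. (2023).

```python
# Hedged sketch: pose the same dilemma in each language and compare the
# chosen policies across languages to measure moral value bias.
from openai import OpenAI

client = OpenAI()
LANGUAGES = ["English", "Spanish", "Russian", "Chinese", "Hindi", "Swahili"]

def probe(dilemma_by_language, model="gpt-4"):
    judgements = {}
    for lang in LANGUAGES:
        prompt = (f"{dilemma_by_language[lang]}\n"
                  "Which resolution policy do you consider morally right? "
                  "Answer with the policy number only.")
        reply = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": prompt}]
        )
        judgements[lang] = reply.choices[0].message.content.strip()
    return judgements  # disagreement across languages indicates bias
```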
https://arxiv.org/abs/2404.18460
The Common Core Ontologies (CCO) are designed as a mid-level ontology suite that extends the Basic Formal Ontology. CCO has since been increasingly adopted by a broad group of users and applications and is proposed as the first standard mid-level ontology. Despite these successes, documentation of the contents and design patterns of the CCO has been comparatively minimal. This paper is a step toward providing enhanced documentation for the mid-level ontology suite through a discussion of the contents of the eleven ontologies that collectively comprise the Common Core Ontology suite.
https://arxiv.org/abs/2404.17758
Mid-level ontologies are used to integrate terminologies and data across disparate domains. There are, however, no clear, defensible criteria for determining whether a given ontology should count as mid-level, because we lack a rigorous characterization of what the middle level of generality is supposed to contain. Attempts to provide such a characterization have failed, we believe, because they have focused on the goal of specifying what is characteristic of those single ontologies that have been advanced as mid-level ontologies. Unfortunately, single ontologies of this sort are generally a mixture of top- and mid-level, and sometimes even of domain-level terms. To gain clarity, we aim to specify the necessary and sufficient conditions for a collection of one or more ontologies to inhabit what we call a mid-level architecture.
https://arxiv.org/abs/2404.17757
Capability ontologies are increasingly used to model the functionalities of systems or machines. Creating such ontological models, with all the properties and constraints of capabilities, is very complex and can only be done by ontology experts. However, Large Language Models (LLMs) have shown that they can generate machine-interpretable models from natural language text input and thus support engineers and ontology experts. This paper therefore investigates how LLMs can be used to create capability ontologies. We present a study with a series of experiments in which capabilities of varying complexity are generated using different prompting techniques and different LLMs. Errors in the generated ontologies are recorded and compared. To analyze the quality of the generated ontologies, a semi-automated approach based on RDF syntax checking, OWL reasoning, and SHACL constraints is used. The results of this study are very promising: even for complex capabilities, the generated ontologies are almost free of errors.
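The three-step quality check maps naturally onto standard Python tooling. The following is a hedged sketch using rdflib, owlrl, and pyshacl; the file names are placeholders, and the paper does not state which libraries it used.

```python
# Sketch of a semi-automated quality check: RDF syntax checking, OWL
# reasoning, then SHACL constraint validation.
from rdflib import Graph
import owlrl
from pyshacl import validate

def check_generated_ontology(ttl_path, shapes_path):
    g = Graph()
    g.parse(ttl_path, format="turtle")   # step 1: syntax errors surface here
    # step 2: materialise OWL RL inferences (inconsistencies become visible)
    owlrl.DeductiveClosure(owlrl.OWLRL_Semantics).expand(g)
    # step 3: check domain constraints expressed as SHACL shapes
    shapes = Graph().parse(shapes_path, format="turtle")
    conforms, _, report_text = validate(g, shacl_graph=shapes)
    return conforms, report_text

# "capability.ttl" / "capability_shapes.ttl" are placeholder file names.
conforms, report = check_generated_ontology("capability.ttl",
                                            "capability_shapes.ttl")
print(conforms, report)
```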
https://arxiv.org/abs/2404.17524
Objective: Clinical trials are essential for advancing pharmaceutical interventions, but they face a bottleneck in selecting eligible participants. Although leveraging electronic health records (EHR) for recruitment has gained popularity, the complex nature of unstructured medical texts makes it challenging to identify participants efficiently. Natural Language Processing (NLP) techniques have emerged as a solution, with a recent focus on transformer models. In this study, we aimed to evaluate the performance of a prompt-based large language model on the cohort selection task from unstructured medical notes collected in the EHR. Methods: To process the medical records, we selected the sentences of each record most relevant to the eligibility criteria needed for the trial. The SNOMED CT concepts related to each eligibility criterion were collected. Medical records were also annotated with MedCAT based on the SNOMED CT ontology. Annotated sentences containing concepts that matched the criteria-relevant terms were extracted. A prompt-based large language model (Generative Pre-trained Transformer (GPT) in this study) was then used with the extracted sentences as the training set. To assess its effectiveness, we evaluated the model's performance on the dataset from the 2018 n2c2 challenge, which aimed to classify the medical records of 311 patients against 13 eligibility criteria through NLP techniques. Results: Our proposed model achieved overall micro and macro F measures of 0.9061 and 0.8060, among the highest scores reported for experiments on this dataset. Conclusion: The application of a prompt-based large language model to classify patients based on eligibility criteria achieved promising scores in this study. In addition, we proposed a method of extractive summarization with the aid of the SNOMED CT ontology that can also be applied to other medical texts.
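A simplified sketch of the extraction step described in the Methods: keep only sentences whose annotated concepts overlap a criterion's SNOMED CT concepts, then assemble a classification prompt. The annotate() helper stands in for the MedCAT annotation pipeline, and the MET/NOT MET answer format is an assumption.

```python
# Hedged sketch of criterion-relevant sentence extraction and prompting.

def select_sentences(record_sentences, annotate, criterion_concepts):
    """annotate(sentence) -> set of SNOMED CT concept IDs found in it."""
    return [s for s in record_sentences
            if annotate(s) & criterion_concepts]  # non-empty overlap

def build_prompt(criterion, sentences):
    evidence = "\n".join(f"- {s}" for s in sentences)
    return (f"Eligibility criterion: {criterion}\n"
            f"Relevant notes:\n{evidence}\n"
            "Does the patient meet this criterion? Answer MET or NOT MET.")
```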
https://arxiv.org/abs/2404.16198
The success of contrastive language-image pretraining (CLIP) relies on the supervision from the pairing between images and captions, which tends to be noisy in web-crawled data. We present Mixture of Data Experts (MoDE) and learn a system of CLIP data experts via clustering. Each data expert is trained on one data cluster, making it less sensitive to false-negative noise in other clusters. At inference time, we ensemble their outputs by applying weights determined through the correlation between task metadata and cluster conditions. To estimate this correlation precisely, the samples in one cluster should be semantically similar, but the number of data experts should still be reasonable for training and inference. As such, we consider the ontology in human language and propose to use fine-grained cluster centers to represent each data expert at a coarse-grained level. Experimental studies show that four CLIP data experts on ViT-B/16 outperform the ViT-L/14 models of OpenAI CLIP and OpenCLIP on zero-shot image classification, but at less than 35% of the training cost. Meanwhile, MoDE can train all data experts asynchronously and can flexibly include new data experts. The code is available at this https URL.
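A toy sketch of the inference-time ensembling idea: weight each data expert by the similarity between a task-metadata embedding and its cluster center, then combine the per-expert outputs. Shapes and the softmax temperature are illustrative assumptions, not MoDE's exact formulation.

```python
# Hedged sketch: correlation-weighted ensemble of per-cluster experts.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def ensemble_logits(task_meta_emb, cluster_centers, expert_logits,
                    temperature=0.1):
    """task_meta_emb: (d,); cluster_centers: (k, d);
    expert_logits: (k, n_classes) -> (n_classes,)."""
    sims = cluster_centers @ task_meta_emb  # correlation with each cluster
    weights = softmax(sims / temperature)   # (k,) ensembling weights
    return weights @ expert_logits          # weighted sum over experts
```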
https://arxiv.org/abs/2404.16030
Expert curation is essential to capture knowledge of enzyme functions from the scientific literature in FAIR open knowledgebases, but it cannot keep pace with the rate of new discoveries and new publications. In this work we present EnzChemRED, for Enzyme Chemistry Relation Extraction Dataset, a new training and benchmarking dataset to support the development of Natural Language Processing (NLP) methods such as (large) language models that can assist enzyme curation. EnzChemRED consists of 1,210 expert-curated PubMed abstracts in which enzymes and the chemical reactions they catalyze are annotated using identifiers from the UniProt Knowledgebase (UniProtKB) and the ontology of Chemical Entities of Biological Interest (ChEBI). We show that fine-tuning pre-trained language models with EnzChemRED can significantly boost their ability to identify mentions of proteins and chemicals in text (Named Entity Recognition, or NER) and to extract the chemical conversions in which they participate (Relation Extraction, or RE), with an average F1 score of 86.30% for NER, 86.66% for RE on chemical conversion pairs, and 83.79% for RE on chemical conversion pairs and their linked enzymes. We combine the best-performing methods after fine-tuning using EnzChemRED to create an end-to-end pipeline for knowledge extraction from text and apply this to abstracts at PubMed scale to create a draft map of enzyme functions in the literature to guide curation efforts in UniProtKB and the reaction knowledgebase Rhea. The EnzChemRED corpus is freely available at this https URL.
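As a hedged illustration of applying such a fine-tuned NER model, the sketch below uses the Hugging Face pipeline API; the checkpoint path is a placeholder and the example sentence is invented, not taken from the corpus.

```python
# Sketch: run a (hypothetical) EnzChemRED-fine-tuned NER checkpoint over
# an abstract to find protein and chemical mentions.
from transformers import pipeline

ner = pipeline("token-classification",
               model="path/to/enzchemred-ner-checkpoint",  # placeholder
               aggregation_strategy="simple")  # merge word pieces into spans

abstract = ("Hexokinase catalyzes the phosphorylation of glucose "
            "to glucose 6-phosphate.")
for span in ner(abstract):
    print(span["entity_group"], span["word"], round(span["score"], 3))
```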
https://arxiv.org/abs/2404.14209
Ontology matching is defined as finding a relationship or correspondence between two or more entities in two or more ontologies. To solve the interoperability problem among domain ontologies, semantically similar entities in these ontologies must be found and aligned before merging them. GraphMatcher, developed in this study, is an ontology matching system that uses a graph attention approach to compute a higher-level representation of a class together with its surrounding terms. GraphMatcher obtained remarkable results in the Ontology Alignment Evaluation Initiative (OAEI) 2022 conference track. Its code is available at this https URL.
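A toy illustration of the underlying idea: a class embedding attends over the embeddings of its surrounding terms (labels, comments, neighbours) to form a contextualised, higher-level representation. The scoring function and dimensions are simplified relative to GraphMatcher's actual graph attention network.

```python
# Hedged sketch: attention-weighted aggregation of a class's neighbourhood.
import numpy as np

def attend(class_vec, term_vecs):
    """class_vec: (d,); term_vecs: (m, d) -> contextualised class vector."""
    scores = term_vecs @ class_vec            # attention logits
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # normalised attention weights
    return class_vec + weights @ term_vecs    # residual aggregation

rng = np.random.default_rng(0)
rep = attend(rng.normal(size=8), rng.normal(size=(5, 8)))
print(rep.shape)  # (8,)
```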
https://arxiv.org/abs/2404.14450
Adverse drug events (ADEs) significantly impact clinical research and public health, contributing to failures in clinical trials and leading to increased healthcare costs. The accurate prediction and management of ADEs are crucial for improving the development of safer, more effective medications, and enhancing patient outcomes. To support this effort, we introduce CT-ADE, a novel dataset compiled to enhance the predictive modeling of ADEs. Encompassing over 12,000 instances extracted from clinical trial results, the CT-ADE dataset integrates drug, patient population, and contextual information for multilabel ADE classification tasks in monopharmacy treatments, providing a comprehensive resource for developing advanced predictive models. To mirror the complex nature of ADEs, annotations are standardized at the system organ class level of the Medical Dictionary for Regulatory Activities (MedDRA) ontology. Preliminary analyses using baseline models have demonstrated promising results, achieving 73.33% F1 score and 81.54% balanced accuracy, highlighting CT-ADE's potential to advance ADE prediction. CT-ADE provides an essential tool for researchers aiming to leverage the power of artificial intelligence and machine learning to enhance patient safety and minimize the impact of ADEs on pharmaceutical research and development. Researchers interested in using the CT-ADE dataset can find all necessary resources at this https URL.
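A hedged baseline sketch for the task shape described here: multilabel classification over MedDRA system organ classes from drug and population text. The toy data stands in for CT-ADE instances; this is not one of the paper's reported baselines.

```python
# Sketch: TF-IDF + one-vs-rest logistic regression for multilabel ADE
# classification at the system organ class level.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

texts = ["drug X, elderly cohort, 10 mg daily",
         "drug Y, adult cohort, 50 mg weekly",
         "drug Z, pediatric cohort, 5 mg daily"]
labels = [["gastrointestinal disorders"],
          ["nervous system disorders"],
          ["gastrointestinal disorders", "nervous system disorders"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)  # binary indicator matrix, one column per SOC
clf = make_pipeline(TfidfVectorizer(),
                    OneVsRestClassifier(LogisticRegression()))
clf.fit(texts, Y)
print(mlb.inverse_transform(clf.predict(["drug X, adult cohort, 20 mg"])))
```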
https://arxiv.org/abs/2404.12827
Current open-domain neural semantic parsers show impressive performance. However, closer inspection of the symbolic meaning representations they produce reveals significant weaknesses: they sometimes merely copy character sequences from the source text to form symbolic concepts, defaulting to the most frequent word sense in the training distribution. By leveraging the hierarchical structure of a lexical ontology, we introduce a novel compositional symbolic representation for concepts based on their position in the taxonomical hierarchy. This representation provides richer semantic information and enhances interpretability. We introduce a neural "taxonomical" semantic parser to utilize this new representation system of predicates, and compare it with a standard neural semantic parser trained on the traditional meaning representation format, employing a novel challenge set and evaluation metric. Our experimental findings demonstrate that the taxonomical model, trained on much richer and more complex meaning representations, performs slightly below the traditional model on the standard evaluation metrics, but outperforms it when dealing with out-of-vocabulary concepts. This finding is encouraging for research in computational semantics that aims to combine data-driven distributional meanings with knowledge-based symbolic representations.
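To make the representation concrete, here is a small illustration using WordNet as the lexical ontology: a concept is encoded by its hypernym path rather than an opaque sense identifier. The encoding scheme is a simplification of the paper's representation.

```python
# Sketch: a taxonomy-positional code for a concept, built from its
# WordNet hypernym path. Requires: nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def taxonomical_code(synset):
    path = synset.hypernym_paths()[0]  # synsets from the root to the concept
    return ".".join(s.name().split(".")[0] for s in path)

print(taxonomical_code(wn.synset("dog.n.01")))
# e.g. entity.physical_entity.object...carnivore.canine.dog
```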
https://arxiv.org/abs/2404.12698
Different entities with the same name can be difficult to distinguish. Handling confusing entity mentions is a crucial skill for language models (LMs). For example, given the question "Where was Michael Jordan educated?" and a set of documents discussing different people named Michael Jordan, can LMs distinguish entity mentions to generate a cohesive answer to the question? To test this ability, we introduce a new benchmark, AmbigDocs. By leveraging Wikipedia's disambiguation pages, we identify a set of documents, belonging to different entities who share an ambiguous name. From these documents, we generate questions containing an ambiguous name and their corresponding sets of answers. Our analysis reveals that current state-of-the-art models often yield ambiguous answers or incorrectly merge information belonging to different entities. We establish an ontology categorizing four types of incomplete answers and automatic evaluation metrics to identify such categories. We lay the foundation for future work on reasoning across multiple documents with ambiguous entities.
https://arxiv.org/abs/2404.12447
We foresee robots that bootstrap knowledge representations and use them to classify relevant situations and make decisions based on future observations. Particularly for assistive robots, the bootstrapping mechanism might be supervised by humans, who should not have to repeat a training phase several times and should be able to refine the taught representation. We consider robots that bootstrap structured representations to classify some intelligible categories. Such a structure should be bootstrapped incrementally, i.e., without invalidating the identified category models when a new additional category is considered. To tackle this scenario, we previously presented the Scene Identification and Tagging (SIT) algorithm, which bootstraps a structured knowledge representation in a crisp OWL-DL ontology. Over time, SIT bootstraps a graph representing scenes, sub-scenes, and similar scenes. SIT can then classify new scenes within the bootstrapped graph through logic-based reasoning. However, SIT has issues with sensory data because its crisp implementation is not robust to perception noise. This paper presents a reformulation of SIT within the fuzzy domain, which exploits a fuzzy DL ontology to overcome these robustness issues. By comparing the performance of the fuzzy and crisp implementations of SIT, we show that fuzzy SIT is robust, preserves the properties of its crisp formulation, and enhances the bootstrapped representations. On the other hand, the fuzzy implementation of SIT leads to less intelligible knowledge representations than those bootstrapped in the crisp domain.
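A toy contrast between crisp and fuzzy scene classification illustrates the robustness issue: an exact (crisp) containment test fails under perception noise, while a fuzzy membership degrades gracefully. The feature counts and membership function are illustrative only, not SIT's DL formulation.

```python
# Hedged sketch: crisp vs. fuzzy classification of a scene against a
# bootstrapped category model (feature -> required count).

def crisp_match(scene, model):
    # exact containment: every required feature must be fully present
    return all(scene.get(f, 0) >= n for f, n in model.items())

def fuzzy_match(scene, model):
    # membership degree in [0, 1]: minimum per-feature satisfaction ratio
    return min(min(scene.get(f, 0) / n, 1.0) for f, n in model.items())

model = {"cup": 2, "book": 1}
noisy_scene = {"cup": 1, "book": 1}       # one cup missed by perception
print(crisp_match(noisy_scene, model))    # False: classification lost
print(fuzzy_match(noisy_scene, model))    # 0.5: still partially recognised
```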
https://arxiv.org/abs/2404.11744
Ontology alignment, a critical process in the Semantic Web for detecting relationships between different ontologies, has traditionally focused on identifying so-called "simple" 1-to-1 relationships through comparison of class labels and properties. The more practically useful exploration of more complex alignments remains a hard problem to automate and as such is largely underexplored; in application practice it is usually done manually by ontology and domain experts. Recently, the surge in Natural Language Processing (NLP) capabilities, driven by advancements in Large Language Models (LLMs), presents new opportunities for enhancing ontology engineering practices, including ontology alignment tasks. This paper investigates the application of LLM technologies to the complex ontology alignment challenge. Leveraging a prompt-based approach and integrating rich ontology content in the form of so-called modules, our work constitutes a significant advance towards automating the complex alignment task.
https://arxiv.org/abs/2404.10329
Ontology Matching (OM) is a critical task in knowledge integration, where aligning heterogeneous ontologies facilitates data interoperability and knowledge sharing. Traditional OM systems often rely on expert knowledge or predictive models, with limited exploration of the potential of Large Language Models (LLMs). We present the LLMs4OM framework, a novel approach to evaluate the effectiveness of LLMs in OM tasks. This framework uses two modules, for retrieval and matching respectively, enhanced by zero-shot prompting across three ontology representations: concept, concept-parent, and concept-children. Through comprehensive evaluations using 20 OM datasets from various domains, we demonstrate that LLMs, under the LLMs4OM framework, can match and even surpass the performance of traditional OM systems, particularly in complex matching scenarios. Our results highlight the potential of LLMs to contribute significantly to the field of OM.
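A sketch of how a matching prompt could be built from the three ontology representations the framework names (concept, concept-parent, concept-children); the render helpers and the yes/no answer format are illustrative assumptions, not the framework's exact prompts.

```python
# Hedged sketch: zero-shot matching prompts over three representations.

def render(concept, representation):
    if representation == "concept":
        return concept["label"]
    if representation == "concept-parent":
        return f"{concept['label']} (a kind of {concept['parent']})"
    if representation == "concept-children":
        kids = ", ".join(concept["children"]) or "none"
        return f"{concept['label']} (subtypes: {kids})"

def match_prompt(source, candidate, representation="concept-parent"):
    return (f"Source concept: {render(source, representation)}\n"
            f"Target concept: {render(candidate, representation)}\n"
            "Do these denote the same concept? Answer yes or no.")

src = {"label": "myocardial infarction", "parent": "heart disease", "children": []}
tgt = {"label": "heart attack", "parent": "cardiac disorder", "children": []}
print(match_prompt(src, tgt))
```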
https://arxiv.org/abs/2404.10317
The paper tackles the issue of mapping logic axioms formalised in the Web Ontology Language (OWL) within the Object-Oriented Programming (OOP) paradigm. Mapping between OWL axiom hierarchies and OOP object hierarchies is difficult because OWL-based reasoning algorithms might change an OWL hierarchy at runtime, whereas OOP hierarchies are usually defined as static structures. Although programming paradigms based on reflection allow OOP hierarchies to change at runtime and OWL axioms to be mapped dynamically, no currently available mechanism does so without limiting the reasoning algorithms. Thus, the factory-based paradigm is typically used, since it decouples the OWL and OOP hierarchies. However, the factory inhibits OOP polymorphism and introduces a paradigm shift with respect to widely accepted OOP practice. We present the OWLOOP API, which exploits the factory so as not to limit reasoning algorithms, and provides novel OOP interfaces for the axioms in an ontology. OWLOOP is designed to limit the paradigm shift required for using ontologies while improving, through OOP-like polymorphism, the modularity of software architectures that exploit logic reasoning. The paper details our OWL-to-OOP mapping mechanism, and it shows the benefits and limitations of OWLOOP through examples concerning a robot in a smart environment.
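A toy Python rendition of the factory-versus-polymorphism trade-off discussed here: the factory chooses a wrapper by OWL class name at runtime, so the OOP type no longer statically mirrors the OWL hierarchy. The class names are illustrative, not OWLOOP's Java API.

```python
# Hedged sketch: factory-based decoupling of OWL and OOP hierarchies.

class Individual:
    def __init__(self, iri):
        self.iri = iri
    def describe(self):
        return f"{self.iri}: owl:Thing"

class RobotIndividual(Individual):
    def describe(self):
        return f"{self.iri}: a Robot"  # polymorphic override

REGISTRY = {"Robot": RobotIndividual}

def factory(iri, owl_class):
    # the reasoner may reclassify individuals at runtime, so the concrete
    # wrapper is chosen here rather than fixed in a static type hierarchy
    return REGISTRY.get(owl_class, Individual)(iri)

print(factory("ex:r2d2", "Robot").describe())
print(factory("ex:table1", "Furniture").describe())
```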
https://arxiv.org/abs/2404.09305
Effective ontology transfer has been a major goal of recent work on event argument extraction (EAE). Two methods in particular -- question answering (QA) and template infilling (TI) -- have emerged as promising approaches to this problem. However, detailed explorations of these techniques' ability to actually enable this transfer are lacking. In this work, we provide such a study, exploring zero-shot transfer using both techniques on six major EAE datasets at both the sentence and document levels. Further, we challenge the growing reliance on LLMs for zero-shot extraction, showing that vastly smaller models trained on an appropriate source ontology can yield zero-shot performance superior to that of GPT-3.5 or GPT-4.
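A minimal sketch of the QA formulation: each argument role in the target ontology becomes a question posed against the passage. The role-to-question templates are illustrative, not drawn from the paper's datasets.

```python
# Hedged sketch: QA-style event argument extraction prompts.

ROLE_QUESTIONS = {
    "Attacker": "Who carried out the {trigger} attack?",
    "Target":   "Who or what was attacked in the {trigger} event?",
    "Place":    "Where did the {trigger} take place?",
}

def qa_prompts(passage, trigger):
    return {role: f"{passage}\nQuestion: {q.format(trigger=trigger)}"
            for role, q in ROLE_QUESTIONS.items()}

for role, prompt in qa_prompts("Rebels shelled the city of X overnight.",
                               "shelled").items():
    print(role, "->", prompt.splitlines()[-1])
```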
https://arxiv.org/abs/2404.08579
The creation of high-quality ontologies is crucial for data integration and knowledge-based reasoning, specifically in the context of the rising data economy. However, automatic ontology matchers are often bound to the heuristics they are based on, leaving many matches unidentified. Interactive ontology matching systems involving human experts have been introduced, but they do not solve the fundamental issue of flexibly finding additional matches outside the scope of the implemented heuristics, even though this is highly demanded in industrial settings. Active machine learning methods appear to be a promising path towards a flexible interactive ontology matcher. However, off-the-shelf active learning mechanisms suffer from low query efficiency due to extreme class imbalance, resulting in a last-mile problem where high human effort is required to identify the remaining matches. To address the last-mile problem, this work introduces DualLoop, an active learning method tailored to ontology matching. DualLoop offers three main contributions: (1) an ensemble of tunable heuristic matchers, (2) a short-term learner with a novel query strategy adapted to highly imbalanced data, and (3) long-term learners to explore potential matches by creating and tuning new heuristics. We evaluated DualLoop on three datasets of varying sizes and domains. Compared to existing active learning methods, we consistently achieved better F1 scores and recall, reducing the expected query cost spent on finding 90% of all matches by over 50%. Compared to traditional interactive ontology matchers, we are able to find additional, last-mile matches. Finally, we detail the successful deployment of our approach within an actual product and report its operational performance results within the Architecture, Engineering, and Construction (AEC) industry sector, showcasing its practical value and efficiency.
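A compact sketch of the query-strategy idea for highly imbalanced data: rank unlabeled candidate pairs by predicted match probability so that expert queries concentrate on the rare positive class. DualLoop's actual short-term learner and heuristics ensemble are more involved than this.

```python
# Hedged sketch: one iteration of an active-learning query step under
# extreme class imbalance (matches are rare among candidate pairs).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(40, 5))                 # labeled pair features
y_labeled = np.array([1] * 4 + [0] * 36)             # few known matches
X_pool = rng.normal(size=(1000, 5))                  # unlabeled candidates

clf = LogisticRegression().fit(X_labeled, y_labeled)
p_match = clf.predict_proba(X_pool)[:, 1]
query_idx = np.argsort(-p_match)[:10]  # ask the expert about likely matches
print(query_idx)
```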
https://arxiv.org/abs/2404.07663
Sourcing and identifying new manufacturing partners is crucial for manufacturing system integrators seeking to enhance agility and reduce risk through supply chain diversification in the global economy. The advent of advanced large language models has captured significant interest, due to their ability to generate comprehensive and articulate responses across a wide range of knowledge domains. However, such systems often fall short in accuracy and completeness when responding to domain-specific inquiries, particularly in areas like manufacturing service discovery. This research explores the potential of leveraging Knowledge Graphs in conjunction with ChatGPT to streamline the process by which prospective clients identify small manufacturing enterprises. In this study, we propose a method that integrates a bottom-up ontology with advanced machine learning models to develop a Manufacturing Service Knowledge Graph from an array of structured and unstructured data sources, including the digital footprints of small-scale manufacturers throughout North America. The Knowledge Graph and the learned graph embedding vectors are leveraged to tackle intricate queries within the digital supply chain network, responding with enhanced reliability and greater interpretability. The approach is scalable to millions of entities that can be distributed to form a global Manufacturing Service Knowledge Network Graph, potentially interconnecting multiple types of Knowledge Graphs across industry sectors, geopolitical boundaries, and business domains. The dataset developed for this study, now publicly accessible, encompasses more than 13,000 manufacturers' weblinks, manufacturing services, certifications, and location entity types.
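A small sketch of querying the learned graph embedding vectors: rank manufacturer entities by cosine similarity to a service embedding. The embedding table and entity names are stand-ins for vectors trained on the actual Knowledge Graph.

```python
# Hedged sketch: embedding-based retrieval over a manufacturing KG.
import numpy as np

entities = ["AcmeCNC", "NorthMill", "PrecisionCast"]       # illustrative
E = np.random.default_rng(1).normal(size=(3, 16))          # entity vectors
service_vec = E[1] + 0.05 * np.random.default_rng(2).normal(size=16)

sims = (E @ service_vec) / (np.linalg.norm(E, axis=1)
                            * np.linalg.norm(service_vec))
for name, s in sorted(zip(entities, sims), key=lambda t: -t[1]):
    print(f"{name}: {s:.3f}")  # highest-similarity manufacturers first
```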
https://arxiv.org/abs/2404.06571
In this paper we present a publicly-available maintenance ontology (Iof-maint). Iof-maint is a modular ontology aligned with the Industrial Ontology Foundry Core (IOF Core) and contains 20 classes and 2 relations. It provides a set of maintenance-specific terms used in a wide variety of practical data-driven use cases. Iof-maint supports OWL DL reasoning, is documented, and is actively maintained on GitHub. In this paper, we describe the evolution of the Iof-maint reference ontology based on the extraction of common concepts identified in a number of application ontologies working with industry maintenance work order, procedure and failure mode data.
https://arxiv.org/abs/2404.05224