Advances in information extraction have enabled the automatic construction of large knowledge graphs (e.g., Yago, Wikidata or Google KG), which are widely used in many applications like semantic search or data analytics. However, due to their semi-automatic construction, KGs are often incomplete. Rule learning methods, concerned with the extraction of frequent patterns from KGs and casting them into rules, can be applied to predict potentially missing facts. A crucial step in this process is rule ranking. Ranking rules is especially challenging over highly incomplete or biased KGs (e.g., KGs predominantly storing facts about famous people), as in this case biased rules might fit the data best and be ranked at the top based on standard statistical metrics like rule confidence. To address this issue, prior works proposed to rank rules relying not only on the original KG but also on facts predicted by a KG embedding model. At the same time, with the recent rise of Language Models (LMs), several works have claimed that LMs can be used as an alternative means for KG completion. In this work, our goal is to verify to what extent the exploitation of LMs is helpful for improving the quality of rule learning systems.
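To make the ranking issue concrete, here is a minimal Python sketch (with made-up facts and a hypothetical `predicted` set standing in for triples proposed by an embedding model or LM) of how a rule's confidence changes once predicted facts are taken into account:

```python
# Minimal sketch: standard vs. augmented rule confidence.
# Rule: livesIn(X, Y) <- bornIn(X, Y)   (illustrative only)

observed = {("alice", "bornIn", "paris"), ("bob", "bornIn", "rome"),
            ("alice", "livesIn", "paris")}
# Hypothetical facts predicted by an embedding model or LM:
predicted = {("bob", "livesIn", "rome")}

def confidence(facts):
    body = [(s, o) for s, p, o in facts if p == "bornIn"]
    head = {(s, o) for s, p, o in facts if p == "livesIn"}
    support = sum((s, o) in head for s, o in body)
    return support / len(body) if body else 0.0

print("confidence on KG alone:", confidence(observed))              # 0.5
print("confidence on KG + predictions:", confidence(observed | predicted))  # 1.0
```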
https://arxiv.org/abs/2409.07869
Knowledge Graph-to-Text (G2T) generation involves verbalizing structured knowledge graphs into natural language text. Recent advancements in Pretrained Language Models (PLMs) have improved G2T performance, but their effectiveness depends on datasets with precise graph-text alignment. However, the scarcity of high-quality, general-domain G2T datasets restricts progress in general-domain G2T research. To address this issue, we introduce the Wikipedia Ontology-Free Graph-text dataset (WikiOFGraph), a new large-scale G2T dataset generated using a novel method that leverages a Large Language Model (LLM) and Data-QuestEval. Our new dataset, which contains 5.85M general-domain graph-text pairs, offers high graph-text consistency without relying on external ontologies. Experimental results demonstrate that a PLM fine-tuned on WikiOFGraph outperforms those trained on other datasets across various evaluation metrics. Our method proves to be a scalable and effective solution for generating high-quality G2T data, significantly advancing the field of G2T generation.
https://arxiv.org/abs/2409.07088
We consider fact-checking approaches that aim to predict the veracity of assertions in knowledge graphs. Five main categories of fact-checking approaches for knowledge graphs have been proposed in the recent literature, each of which is subject to partially overlapping limitations. In particular, current text-based approaches are limited by manual feature engineering. Path-based and rule-based approaches are limited by their exclusive use of knowledge graphs as background knowledge, and embedding-based approaches suffer from low accuracy scores on current fact-checking tasks. We propose a hybrid approach -- dubbed HybridFC -- that exploits the diversity of existing categories of fact-checking approaches within an ensemble learning setting to achieve a significantly better prediction performance. In particular, our approach outperforms the state of the art by 0.14 to 0.27 in terms of Area Under the Receiver Operating Characteristic curve on the FactBench dataset. Our code is open-source and can be found at this https URL.
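As a rough illustration of the ensemble idea (not HybridFC's actual feature set or meta-learner), one can train a simple meta-classifier over per-category veracity scores; a sketch with synthetic data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy illustration of an ensemble over per-category veracity scores
# (text-, path-, rule-, and embedding-based); the real HybridFC features
# and meta-learner may differ.
rng = np.random.default_rng(0)
X = rng.random((200, 4))                   # one score per approach category
y = (X.mean(axis=1) > 0.5).astype(int)     # synthetic "true/false" labels

meta = LogisticRegression().fit(X, y)      # meta-classifier over the scores
print(meta.predict_proba([[0.9, 0.8, 0.7, 0.6]])[0, 1])  # ensembled veracity
```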
https://arxiv.org/abs/2409.06692
Knowledge representation has been a central aim of AI since its inception. Symbolic Knowledge Graphs (KGs) and neural Large Language Models (LLMs) can both represent knowledge. KGs provide highly accurate and explicit knowledge representation but face scalability issues, while LLMs offer expansive coverage of knowledge but incur significant training costs and struggle with precise and reliable knowledge manipulation. To this end, we introduce OneEdit, a neural-symbolic prototype system for collaborative knowledge editing using natural language, which facilitates easy-to-use knowledge management with a KG and an LLM. OneEdit consists of three modules: 1) The Interpreter handles user interaction in natural language; 2) The Controller manages editing requests from various users, leveraging the KG with rollbacks to handle knowledge conflicts and prevent toxic knowledge attacks; 3) The Editor uses the knowledge from the Controller to edit the KG and the LLM. We conduct experiments on two new KG-grounded datasets, which demonstrate that OneEdit achieves superior performance.
https://arxiv.org/abs/2409.07497
A key challenge in artificial intelligence is the creation of systems capable of autonomously advancing scientific understanding by exploring novel domains, identifying complex patterns, and uncovering previously unseen connections in vast scientific data. In this work, we present SciAgents, an approach that leverages three core concepts: (1) the use of large-scale ontological knowledge graphs to organize and interconnect diverse scientific concepts, (2) a suite of large language models (LLMs) and data retrieval tools, and (3) multi-agent systems with in-situ learning capabilities. Applied to biologically inspired materials, SciAgents reveals hidden interdisciplinary relationships that were previously considered unrelated, achieving a scale, precision, and exploratory power that surpass traditional human-driven research methods. The framework autonomously generates and refines research hypotheses, elucidating underlying mechanisms, design principles, and unexpected material properties. By integrating these capabilities in a modular fashion, the intelligent system yields material discoveries, critiques and improves existing hypotheses, retrieves up-to-date data about existing research, and highlights their strengths and limitations. Our case studies demonstrate scalable capabilities to combine generative AI, ontological representations, and multi-agent modeling, harnessing a 'swarm of intelligence' similar to that of biological systems. This provides new avenues for materials discovery and accelerates the development of advanced materials by unlocking Nature's design principles.
https://arxiv.org/abs/2409.05556
The integration of Large Language Models (LLMs) with Knowledge Graphs (KGs) offers significant synergistic potential for knowledge-driven applications. One possible integration is the interpretation and generation of formal languages, such as those used in the Semantic Web, with SPARQL being a core technology for accessing KGs. In this paper, we focus on measuring the out-of-the-box capabilities of LLMs to work with SPARQL, and more specifically with SPARQL SELECT queries, applying a quantitative approach. We implemented various benchmarking tasks in the LLM-KG-Bench framework for automated execution and evaluation with several LLMs. The tasks assess capabilities along the dimensions of syntax, semantic read, semantic create, and the role of knowledge graph prompt inclusion. With these new benchmarking tasks, we evaluated a selection of GPT, Gemini, and Claude models. Our findings indicate that working with SPARQL SELECT queries is still challenging for LLMs and heavily depends on the specific LLM as well as the complexity of the task. While fixing basic syntax errors seems to pose no problems for the best of the current LLMs evaluated, creating semantically correct SPARQL SELECT queries remains difficult in several cases.
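For the syntax dimension, a check of this kind can be approximated with an off-the-shelf SPARQL parser; a minimal sketch (using rdflib rather than the LLM-KG-Bench harness itself, with a hypothetical LLM-generated query):

```python
from rdflib.plugins.sparql import prepareQuery

# Hypothetical LLM-generated query; the benchmark additionally checks whether
# executing it returns the expected bindings (the semantic dimensions).
candidate = """
SELECT ?label WHERE {
  ?s <http://www.w3.org/2000/01/rdf-schema#label> ?label .
} LIMIT 5
"""

try:
    prepareQuery(candidate)        # syntax dimension: does the query parse?
    print("syntactically valid SPARQL SELECT")
except Exception as err:
    print("syntax error:", err)
```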
https://arxiv.org/abs/2409.05925
Harnessing the robust capabilities of Large Language Models (LLMs) for narrative generation, logical reasoning, and common-sense knowledge integration, this study delves into utilizing LLMs to enhance automated radiology report generation (R2Gen). Despite the wealth of knowledge within LLMs, efficiently triggering relevant knowledge within these large models for specific tasks like R2Gen poses a critical research challenge. This paper presents KARGEN, a Knowledge-enhanced Automated radiology Report GENeration framework based on LLMs. Utilizing a frozen LLM to generate reports, the framework integrates a knowledge graph to unlock chest disease-related knowledge within the LLM to enhance the clinical utility of generated reports. This is achieved by leveraging the knowledge graph to distill disease-related features in a designed way. Since a radiology report encompasses both normal and disease-related findings, the extracted graph-enhanced disease-related features are integrated with regional image features, attending to both aspects. We explore two fusion methods to automatically prioritize and select the most relevant features. The fused features are employed by the LLM to generate reports that are more sensitive to diseases and of improved quality. Our approach demonstrates promising results on the MIMIC-CXR and IU-Xray datasets.
https://arxiv.org/abs/2409.05370
Knowledge graphs (KGs) have recently been used in many tools and applications, making them rich resources in a structured format. However, in the real world, KGs grow due to the addition of new knowledge in the form of entities and relations, making these KGs dynamic. This chapter formally defines several types of dynamic KGs and summarizes how these KGs can be represented. Additionally, many neurosymbolic methods have been proposed for learning representations over static KGs for several tasks such as KG completion and entity alignment. This chapter further focuses on neurosymbolic methods for dynamic KGs with or without temporal information. More specifically, it provides insight into neurosymbolic methods for dynamic (temporal or non-temporal) KG completion and entity alignment tasks. It further discusses the challenges of current approaches and provides some future directions.
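A common way to represent the temporal variant discussed here is to extend triples with a validity interval; a minimal data-model sketch (illustrative only, not the chapter's formalism):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class TemporalFact:
    head: str
    relation: str
    tail: str
    valid_from: Optional[str] = None   # ISO date; None for non-temporal facts
    valid_to: Optional[str] = None

kg = {
    TemporalFact("barack_obama", "presidentOf", "usa", "2009-01-20", "2017-01-20"),
    TemporalFact("usa", "capital", "washington_dc"),   # static fact
}
# A dynamic KG update is then simply the addition (or retraction) of such facts.
kg = kg | {TemporalFact("joe_biden", "presidentOf", "usa", "2021-01-20", None)}
print(len(kg))
```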
https://arxiv.org/abs/2409.04572
Current publicly available knowledge work data collections lack diversity, extensive annotations, and contextual information about the users and their documents. These issues hinder objective and comparable data-driven evaluations and optimizations of knowledge work assistance systems. Due to the considerable resources needed to collect such data in real-life settings and the necessity of data censorship, collecting such a dataset appears nearly impossible. For this reason, we propose a configurable, multi-agent knowledge work dataset generator. This system simulates collaborative knowledge work among agents producing Large Language Model-generated documents and accompanying data traces. Additionally, the generator captures all background information, given in its configuration or created during the simulation process, in a knowledge graph. Finally, the resulting dataset can be utilized and shared without privacy or confidentiality concerns. This paper introduces our approach's design and vision and focuses on generating authentic knowledge work documents using Large Language Models. Our study involving human raters who assessed 53% of the generated and 74% of the real documents as realistic demonstrates the potential of our approach. Furthermore, we analyze the authenticity criteria mentioned in the participants' comments and elaborate on potential improvements for identified common issues.
https://arxiv.org/abs/2409.04286
Advancements in natural language processing have revolutionized the way we can interact with digital information systems, such as databases, making them more accessible. However, challenges persist, especially when accuracy is critical, as in the biomedical domain. A key issue is the hallucination problem, where models generate information unsupported by the underlying data, potentially leading to dangerous misinformation. This paper presents a novel approach designed to bridge this gap by combining Large Language Models (LLMs) and Knowledge Graphs (KGs) to improve the accuracy and reliability of question-answering systems, using a biomedical KG as an example. Built on the LangChain framework, our method incorporates a query checker that ensures the syntactic and semantic validity of LLM-generated queries, which are then used to extract information from a Knowledge Graph, substantially reducing errors such as hallucinations. We evaluated the overall performance using a new benchmark dataset of 50 biomedical questions, testing several LLMs, including GPT-4 Turbo and llama3:70b. Our results indicate that while GPT-4 Turbo outperforms other models in generating accurate queries, open-source models like llama3:70b show promise with appropriate prompt engineering. To make this approach accessible, a user-friendly web-based interface has been developed, allowing users to input natural language queries, view generated and corrected Cypher queries, and verify the resulting paths for accuracy. Overall, this hybrid approach effectively addresses common issues such as data gaps and hallucinations, offering a reliable and intuitive solution for question answering systems. The source code for generating the results of this paper and for the user interface can be found in our Git repository: this https URL
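A minimal sketch of such a generate-check-retry loop, using the official neo4j Python driver with a placeholder generate_cypher() standing in for the LLM call and hypothetical node labels (this is not the paper's LangChain implementation):

```python
from neo4j import GraphDatabase

def generate_cypher(question: str) -> str:
    # Placeholder for the LLM call that translates the question into Cypher.
    return "MATCH (d:Disease)-[:ASSOCIATED_WITH]->(g:Gene) RETURN g.name LIMIT 5"

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
question = "Which genes are associated with diseases?"
query = generate_cypher(question)

with driver.session() as session:
    try:
        session.run("EXPLAIN " + query)      # query check: planned but not executed
    except Exception as err:
        # One corrective retry, feeding the error back to the (stubbed) LLM.
        query = generate_cypher(question + f"\nFix this error: {err}")
    records = session.run(query)
    print([record["g.name"] for record in records])
driver.close()
```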
https://arxiv.org/abs/2409.04181
Knowledge Graph Completion has been increasingly adopted as a useful method for several tasks in biomedical research, like drug repurposing or drug-target identification. To that end, a variety of datasets and Knowledge Graph Embedding models have been proposed over the years. However, little is known about the properties that render a dataset useful for a given task and, even though the theoretical properties of Knowledge Graph Embedding models are well understood, their practical utility in this field remains controversial. We conduct a comprehensive investigation into the topological properties of publicly available biomedical Knowledge Graphs and establish links to the accuracy observed in real-world applications. By releasing all model predictions and a new suite of analysis tools, we invite the community to build upon our work and continue improving the understanding of these crucial applications.
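As an illustration of the kind of topological statistics such an analysis involves, a toy example with networkx (the actual study covers much larger public biomedical KGs and a broader set of properties):

```python
import networkx as nx

# Toy stand-in for a biomedical KG; entities and relations are made up.
G = nx.MultiDiGraph()
G.add_edges_from([
    ("aspirin", "COX1", {"relation": "inhibits"}),
    ("aspirin", "pain", {"relation": "treats"}),
    ("COX1", "prostaglandin", {"relation": "produces"}),
    ("ibuprofen", "COX1", {"relation": "inhibits"}),
])

degrees = dict(G.degree())
print("nodes:", G.number_of_nodes(), "edges:", G.number_of_edges())
print("density:", nx.density(G))
print("max degree:", max(degrees.values()))
print("relation types:", {d["relation"] for _, _, d in G.edges(data=True)})
```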
https://arxiv.org/abs/2409.04103
Relation classification (RC) plays a pivotal role in both natural language understanding and knowledge graph completion. It is generally formulated as the task of recognizing the relationship between two entities of interest appearing in a free-text sentence. Conventional approaches to RC, whether based on feature engineering or deep learning, can obtain promising performance on categorizing common types of relations but leave a large proportion of long-tail relations unrecognized due to insufficient labeled instances for training. In this paper, we argue that few-shot learning is of great practical significance to RC and thus improve a modern metric-learning framework for few-shot RC. Specifically, we adopt a large-margin ProtoNet with fine-grained features, expecting them to generalize well on long-tail relations. Extensive experiments were conducted on FewRel, a large-scale supervised few-shot RC dataset, to evaluate our framework, LM-ProtoNet (FGF). The results demonstrate that it achieves substantial improvements over many baseline approaches.
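The core prototypical-network classification rule can be sketched in a few lines of numpy (prototype = mean support embedding, nearest prototype wins); the large-margin loss and fine-grained features of LM-ProtoNet (FGF) are omitted here:

```python
import numpy as np

rng = np.random.default_rng(0)
n_way, k_shot, dim = 3, 5, 16            # 3 relations, 5 labelled examples each

# Hypothetical sentence embeddings for the support set and one query sentence.
support = rng.normal(size=(n_way, k_shot, dim))
query = rng.normal(size=(dim,))

prototypes = support.mean(axis=1)                      # one prototype per relation
dists = np.linalg.norm(prototypes - query, axis=1)     # Euclidean distances
print("predicted relation:", int(np.argmin(dists)))
# The large-margin variant additionally penalizes small gaps between the distance
# to the true prototype and the distance to the closest wrong prototype.
```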
https://arxiv.org/abs/2409.04009
While large multimodal models (LMMs) have obtained strong performance on many multimodal tasks, they may still hallucinate while generating text. Their performance on detecting salient features from visual data is also unclear. In this paper, we develop a framework to generate faithful and salient text from mixed-modal data, which includes images and structured data (represented in knowledge graphs or tables). Specifically, we train a small vision critic model to identify hallucinated and non-salient features from the image modality. The critic model also generates a list of salient image features. This information is used in the post-editing step to improve the generation quality. Experiments on two datasets show that our framework improves LMMs' generation quality on both faithfulness and saliency, outperforming recent techniques aimed at reducing hallucination.
https://arxiv.org/abs/2409.03961
To protect patient safety, modern pharmaceutical complexity demands strict prescription verification. We offer a new approach - Rx Strategist - that makes use of knowledge graphs and different search strategies to enhance the power of Large Language Models (LLMs) inside an agentic framework. This multifaceted technique allows for a multi-stage LLM pipeline and reliable information retrieval from a custom-built active ingredient database. Different facets of prescription verification, such as indication, dose, and possible drug interactions, are covered in each stage of the pipeline. We alleviate the drawbacks of monolithic LLM techniques by spreading reasoning over these stages, improving correctness and reliability while reducing memory demands. Our findings demonstrate that Rx Strategist surpasses many current LLMs, achieving performance comparable to that of a highly experienced clinical pharmacist. In the complicated world of modern medications, this combination of LLMs with organized knowledge and sophisticated search methods presents a viable avenue for reducing prescription errors and enhancing patient outcomes.
https://arxiv.org/abs/2409.03440
Most available data is unstructured, making it challenging to access valuable information. Automatically building Knowledge Graphs (KGs) is crucial for structuring data and making it accessible, allowing users to search for information effectively. KGs also facilitate insights, inference, and reasoning. Traditional NLP methods, such as named entity recognition and relation extraction, are key in information retrieval but face limitations, including the use of predefined entity types and the need for supervised learning. Current research leverages large language models' capabilities, such as zero- or few-shot learning. However, unresolved and semantically duplicated entities and relations still pose challenges, leading to inconsistent graphs and requiring extensive post-processing. Additionally, most approaches are topic-dependent. In this paper, we propose iText2KG, a method for incremental, topic-independent KG construction without post-processing. This plug-and-play, zero-shot method is applicable across a wide range of KG construction scenarios and comprises four modules: Document Distiller, Incremental Entity Extractor, Incremental Relation Extractor, and Graph Integrator and Visualization. Our method demonstrates superior performance compared to baseline methods across three scenarios: converting scientific papers to graphs, websites to graphs, and CVs to graphs.
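A toy sketch of the incremental entity-resolution idea behind such post-processing-free construction (merge a newly extracted entity only if an existing one is close enough in embedding space); the threshold and interfaces are illustrative, not iText2KG's actual modules:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Existing, incrementally built entity store: name -> embedding.
rng = np.random.default_rng(0)
entity_store = {"Barack Obama": rng.normal(size=32), "Paris": rng.normal(size=32)}

def resolve(name: str, emb: np.ndarray, threshold: float = 0.9) -> str:
    """Return an existing entity if one is similar enough, else register a new one."""
    best = max(entity_store, key=lambda e: cosine(entity_store[e], emb), default=None)
    if best is not None and cosine(entity_store[best], emb) >= threshold:
        return best                      # reuse: avoids semantic duplicates
    entity_store[name] = emb             # new entity enters the graph
    return name

print(resolve("B. Obama", entity_store["Barack Obama"] + 0.01))  # merges
print(resolve("Berlin", rng.normal(size=32)))                    # new node
```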
https://arxiv.org/abs/2409.03284
Detecting anomalies in human-related videos is crucial for surveillance applications. Current methods primarily include appearance-based and action-based techniques. Appearance-based methods rely on low-level visual features such as color, texture, and shape. They learn a large number of pixel patterns and features related to known scenes during training, making them effective in detecting anomalies within these familiar contexts. However, when encountering new or significantly changed scenes, i.e., unknown scenes, they often fail because existing SOTA methods do not effectively capture the relationship between actions and their surrounding scenes, resulting in low generalization. In contrast, action-based methods focus on detecting anomalies in human actions but are usually less informative because they tend to overlook the relationship between actions and their scenes, leading to incorrect detection. For instance, the normal event of running on the beach and the abnormal event of running on the street might both be considered normal due to the lack of scene information. In short, current methods struggle to integrate low-level visual and high-level action features, leading to poor anomaly detection in varied and complex scenes. To address this challenge, we propose a novel decoupling-based architecture for human-related video anomaly detection (DecoAD). DecoAD significantly improves the integration of visual and action features through the decoupling and interweaving of scenes and actions, thereby enabling a more intuitive and accurate understanding of complex behaviors and scenes. DecoAD supports fully supervised, weakly supervised, and unsupervised settings.
https://arxiv.org/abs/2409.03236
Large Language Models (LLMs) may suffer from hallucinations in real-world applications due to the lack of relevant knowledge. In contrast, knowledge graphs encompass extensive, multi-relational structures that store a vast array of symbolic facts. Consequently, integrating LLMs with knowledge graphs has been extensively explored, with Knowledge Graph Question Answering (KGQA) serving as a critical touchstone for the integration. This task requires LLMs to answer natural language questions by retrieving relevant triples from knowledge graphs. However, existing methods face two significant challenges: excessively long reasoning paths distracting from the answer generation, and false-positive relations hindering the path refinement. In this paper, we propose an iterative interactive KGQA framework that leverages the interactive learning capabilities of LLMs to perform reasoning and Debating over Graphs (DoG). Specifically, DoG employs a subgraph-focusing mechanism, allowing LLMs to perform answer trying after each reasoning step, thereby mitigating the impact of lengthy reasoning paths. On the other hand, DoG utilizes a multi-role debate team to gradually simplify complex questions, reducing the influence of false-positive relations. This debate mechanism ensures the reliability of the reasoning process. Experimental results on five public datasets demonstrate the effectiveness and superiority of our architecture. Notably, DoG outperforms the state-of-the-art method ToG by 23.7% and 9.1% in accuracy on WebQuestions and GrailQA, respectively. Furthermore, the integration experiments with various LLMs on the mentioned datasets highlight the flexibility of DoG. Code is available at this https URL.
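A schematic sketch of the "try an answer after every expansion step" idea; the KG access and LLM calls below are stubs, not DoG's actual prompts or debate roles:

```python
# Schematic sketch: attempt an answer after each subgraph expansion so that
# long reasoning paths cannot drown out the question. All calls are stubs.

def llm(prompt: str) -> str:
    # Stub standing in for a chat-model call.
    return "Paris" if "capital of France" in prompt else "unknown"

def expand(subgraph: set, hop: int) -> set:
    # Stub standing in for retrieving one more hop of triples around the topic entity.
    if hop == 0:
        return subgraph | {("France", "capital", "Paris")}
    return subgraph

def answer_with_early_tries(question: str, max_hops: int = 4) -> str:
    subgraph: set = set()
    attempt = "unknown"
    for hop in range(max_hops):
        subgraph = expand(subgraph, hop)
        attempt = llm(f"{question}\nTriples: {subgraph}")
        if attempt != "unknown":       # answer try succeeded: stop early,
            break                      # before the reasoning path grows long
    return attempt

print(answer_with_early_tries("What is the capital of France?"))
```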
https://arxiv.org/abs/2409.03155
Large Language Models (LLMs) frequently lack domain-specific knowledge, and even fine-tuned models tend to hallucinate. Hence, more reliable models that can include external knowledge are needed. We present a pipeline, 4StepFocus, and specifically a preprocessing step, that can substantially improve the answers of LLMs. This is achieved by providing guided access to external knowledge, making use of the model's ability to capture relational context and conduct rudimentary reasoning on its own. The method narrows down potentially correct answers by triplet-based searches in a semi-structured knowledge base in a direct, traceable fashion, before switching to latent representations for ranking those candidates based on unstructured data. This distinguishes it from related methods that are purely based on latent representations. 4StepFocus consists of the following steps: 1) triplet generation for extraction of relational data by an LLM, 2) substitution of variables in those triplets to narrow down answer candidates employing a knowledge graph, 3) sorting the remaining candidates with a vector similarity search involving associated non-structured data, 4) reranking the best candidates by the LLM with background data provided. Experiments on a medical, a product recommendation, and an academic paper search test set demonstrate that this approach is indeed a powerful augmentation. It not only adds relevant traceable background information from information retrieval, but also improves performance considerably in comparison to state-of-the-art methods. This paper presents a novel, largely unexplored direction and therefore provides a wide range of future work opportunities. The source code used is available at this https URL.
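A schematic sketch of such a four-step narrow-then-rerank flow; every function body below is a stub standing in for the corresponding step, not the paper's implementation:

```python
# Schematic sketch of a four-step narrow-then-rerank pipeline; all stubs.

def extract_triplet(question: str):
    # Step 1: LLM turns the question into a relational triplet with a variable.
    return ("?drug", "treats", "migraine")

def candidates_from_kg(triplet):
    # Step 2: substitute the variable by querying a knowledge graph (stubbed).
    return ["sumatriptan", "ibuprofen", "metformin"]

def rank_by_similarity(question: str, candidates):
    # Step 3: vector-similarity ranking over associated unstructured text (stubbed).
    return sorted(candidates, key=lambda c: c != "sumatriptan")

def rerank_with_llm(question: str, top_candidates):
    # Step 4: final LLM rerank given retrieved background passages (stubbed).
    return top_candidates[0]

question = "Which drug treats migraine?"
print(rerank_with_llm(question, rank_by_similarity(
    question, candidates_from_kg(extract_triplet(question)))))
```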
https://arxiv.org/abs/2409.00861
This paper presents an ontology design along with knowledge engineering, and multilingual semantic reasoning techniques to build an automated system for assimilating culinary information for Indian food in the form of a knowledge graph. The main focus is on designing intelligent methods to derive ontology designs and capture all-encompassing knowledge about food, recipes, ingredients, cooking characteristics, and most importantly, nutrition, at scale. We present our ongoing work in this workshop paper, describe in some detail the relevant challenges in curating knowledge of Indian food, and propose our high-level ontology design. We also present a novel workflow that uses AI, LLM, and language technology to curate information from recipe blog sites in the public domain to build knowledge graphs for Indian food. The methods for knowledge curation proposed in this paper are generic and can be replicated for any domain. The design is application-agnostic and can be used for AI-driven smart analysis, building recommendation systems for Personalized Digital Health, and complementing the knowledge graph for Indian food with contextual information such as user information, food biochemistry, geographic information, agricultural information, etc.
https://arxiv.org/abs/2409.00830
Recently, there has been an increasing interest in the construction of general-domain and domain-specific causal knowledge graphs. Such knowledge graphs enable reasoning for causal analysis and event prediction, and so have a range of applications across different domains. While great progress has been made toward automated construction of causal knowledge graphs, the evaluation of such solutions has either focused on low-level tasks (e.g., cause-effect phrase extraction) or on ad hoc evaluation data and small manual evaluations. In this paper, we present a corpus, task, and evaluation framework for causal knowledge graph construction. Our corpus consists of Wikipedia articles for a collection of event-related concepts in Wikidata. The task is to extract causal relations between event concepts from the corpus. The evaluation is performed in part using existing causal relations in Wikidata to measure recall, and in part using Large Language Models to avoid the need for manual or crowd-sourced evaluation. We evaluate a pipeline for causal knowledge graph construction that relies on neural models for question answering and concept linking, and show how the corpus and the evaluation framework allow us to effectively find the right model for each task. The corpus and the evaluation framework are publicly available.
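The Wikidata-based recall part of such an evaluation reduces to a set intersection; a toy illustration with made-up causal pairs:

```python
# Toy illustration of the recall part of the evaluation: how many causal pairs
# already asserted in Wikidata does the extraction pipeline recover?
wikidata_causal = {("earthquake", "tsunami"), ("smoking", "lung_cancer"),
                   ("drought", "famine")}
extracted_causal = {("earthquake", "tsunami"), ("drought", "famine"),
                    ("rain", "flood")}   # extra pairs are judged separately (e.g. by an LLM)

recall = len(wikidata_causal & extracted_causal) / len(wikidata_causal)
print(f"recall against Wikidata: {recall:.2f}")   # 0.67
```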
https://arxiv.org/abs/2409.00331