Machine listening systems often rely on fixed taxonomies to organize and label audio data, which is key for training and evaluating deep neural networks (DNNs) and other supervised algorithms. However, such taxonomies face significant constraints: they are composed of application-dependent predefined categories, which hinders the integration of new or varied sounds, and they exhibit limited cross-dataset compatibility due to inconsistent labeling standards. To overcome these limitations, we introduce SALT: Standardized Audio event Label Taxonomy. Building upon the hierarchical structure of AudioSet's ontology, our taxonomy extends and standardizes labels across 24 publicly available environmental sound datasets, allowing class labels from diverse datasets to be mapped to a unified system. Our proposal comes with a new Python package designed for navigating and utilizing this taxonomy, easing cross-dataset label searching and hierarchical exploration. Notably, our package allows effortless data aggregation from diverse sources, and hence easy experimentation with combined datasets.
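To illustrate the kind of cross-dataset label unification such a taxonomy enables, here is a minimal sketch; the dataset names, labels, and taxonomy nodes below are hypothetical, and the actual SALT package API may differ.

```python
# Hypothetical illustration of mapping dataset-specific labels onto a shared,
# AudioSet-style hierarchical taxonomy; names and node IDs are made up.

# A tiny slice of a hierarchical taxonomy: node -> parent
TAXONOMY_PARENT = {
    "dog_bark": "domestic_animals",
    "domestic_animals": "animals",
    "animals": "root",
}

# Per-dataset (dataset, label) -> taxonomy node mappings
DATASET_TO_TAXONOMY = {
    ("urbansound8k", "dog_bark"): "dog_bark",
    ("esc50", "dog"): "dog_bark",
}

def unify_label(dataset: str, label: str) -> str:
    """Map a dataset-specific label to its node in the unified taxonomy."""
    return DATASET_TO_TAXONOMY[(dataset, label)]

def ancestors(node: str) -> list:
    """Walk up the hierarchy so labels can be compared at any granularity."""
    path = []
    while node in TAXONOMY_PARENT:
        node = TAXONOMY_PARENT[node]
        path.append(node)
    return path

# Two differently named labels resolve to the same taxonomy node,
# which is what makes aggregating the two datasets straightforward.
assert unify_label("urbansound8k", "dog_bark") == unify_label("esc50", "dog")
print(ancestors("dog_bark"))  # ['domestic_animals', 'animals', 'root']
```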
https://arxiv.org/abs/2409.11746
Human cognition can leverage fundamental conceptual knowledge, such as geometric and kinematic knowledge, to appropriately perceive, comprehend, and interact with novel objects. Motivated by this observation, we aim to endow machine intelligence with an analogous capability by operating at the conceptual level, in order to understand and then interact with articulated objects, especially those in novel categories, which is challenging due to their intricate geometric structures and diverse joint types. To achieve this goal, we propose the Analytic Ontology Template (AOT), a parameterized and differentiable program description of generalized conceptual ontologies. A baseline approach, AOTNet, driven by AOTs is designed accordingly to equip intelligent agents with these generalized concepts and to empower them to effectively discover conceptual knowledge about the structure and affordances of articulated objects. The AOT-driven approach yields benefits from three key perspectives: i) enabling concept-level understanding of articulated objects without relying on any real training data, ii) providing analytic structure information, and iii) introducing rich affordance information indicating proper ways of interaction. We conduct exhaustive experiments, and the results demonstrate the superiority of our approach in understanding and then interacting with articulated objects.
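As a rough illustration only (not the authors' AOT formulation), a parameterized, differentiable template for a revolute-joint object could look like the PyTorch sketch below, with its geometric and kinematic parameters exposed as optimizable tensors.

```python
import torch
import torch.nn as nn

class RevoluteJointTemplate(nn.Module):
    """Toy parameterized template for an object part rotating about a hinge.

    The pivot location and hinge axis are differentiable parameters, so they
    can be fitted to observations by gradient descent. This only sketches the
    general idea of a differentiable conceptual template.
    """

    def __init__(self):
        super().__init__()
        self.pivot = nn.Parameter(torch.zeros(3))                 # hinge position
        self.axis = nn.Parameter(torch.tensor([0.0, 0.0, 1.0]))   # hinge axis

    def forward(self, angle: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
        """Rotate the movable part's points by `angle` about the hinge axis."""
        axis = self.axis / self.axis.norm()
        rel = points - self.pivot
        cos, sin = torch.cos(angle), torch.sin(angle)
        # Rodrigues' rotation formula, kept differentiable end to end.
        rotated = (rel * cos
                   + torch.cross(axis.expand_as(rel), rel, dim=-1) * sin
                   + axis * (rel @ axis).unsqueeze(-1) * (1 - cos))
        return rotated + self.pivot

template = RevoluteJointTemplate()
pts = torch.randn(8, 3)
out = template(torch.tensor(0.5), pts)
print(out.shape)  # torch.Size([8, 3])
```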
https://arxiv.org/abs/2409.11702
Music-text multimodal systems have enabled new approaches to Music Information Research (MIR) applications such as audio-to-text and text-to-audio retrieval, text-based song generation, and music captioning. Despite the reported success, little effort has been put into evaluating the musical knowledge of Large Language Models (LLMs). In this paper, we demonstrate that LLMs suffer from 1) prompt sensitivity, 2) inability to model negation (e.g. 'rock song without guitar'), and 3) sensitivity towards the presence of specific words. We quantify these properties as a triplet-based accuracy, evaluating the ability to model the relative similarity of labels in a hierarchical ontology. We leverage the AudioSet ontology to generate triplets consisting of an anchor, a positive (relevant) label, and a negative (less relevant) label for the genre and instrument sub-trees. We evaluate the triplet-based musical knowledge of six general-purpose Transformer-based models. The triplets obtained through this methodology required filtering, as some were difficult to judge and therefore relatively uninformative for evaluation purposes. Despite the relatively high accuracy reported, inconsistencies are evident in all six models, suggesting that off-the-shelf LLMs need adaptation to music before use.
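A minimal sketch of triplet-based accuracy over (anchor, positive, negative) label triplets; the embedding function and example triplets are placeholders, not the paper's models or data.

```python
import numpy as np

def triplet_accuracy(triplets, embed):
    """Fraction of (anchor, positive, negative) label triplets for which the
    model ranks the positive label closer to the anchor than the negative."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    correct = 0
    for anchor, positive, negative in triplets:
        a, p, n = embed(anchor), embed(positive), embed(negative)
        correct += cosine(a, p) > cosine(a, n)
    return correct / len(triplets)

# Placeholder embedding; in practice this would be a text-encoder / LLM embedding.
rng = np.random.default_rng(0)
vocab = {}
def toy_embed(label):
    if label not in vocab:
        vocab[label] = rng.normal(size=16)
    return vocab[label]

# Triplets drawn from ontology sub-trees, e.g. genres and instruments (made-up examples).
triplets = [
    ("rock", "heavy metal", "string quartet"),
    ("guitar", "electric guitar", "flute"),
]
print(triplet_accuracy(triplets, toy_embed))
```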
https://arxiv.org/abs/2409.11449
This paper outlines the LLMs4OL 2024, the first edition of the Large Language Models for Ontology Learning Challenge. LLMs4OL is a community development initiative collocated with the 23rd International Semantic Web Conference (ISWC) to explore the potential of Large Language Models (LLMs) in Ontology Learning (OL), a vital process for enhancing the web with structured knowledge to improve interoperability. By leveraging LLMs, the challenge aims to advance understanding and innovation in OL, aligning with the goals of the Semantic Web to create a more intelligent and user-friendly web. In this paper, we give an overview of the 2024 edition of the LLMs4OL challenge and summarize the contributions.
https://arxiv.org/abs/2409.10146
Along with the rapid growth of autonomous vehicles (AVs), more and more demands are placed on environment perception technology. Among others, HD mapping has taken on one of the more prominent roles in helping the vehicle accomplish essential tasks such as localization and path planning. While increasing research efforts have been directed toward HD map development, a comprehensive overview of the overall HD map mapping and update framework is still lacking. This article introduces the development and current state of the algorithms involved in creating and maintaining HD maps. As part of this study, the primary data preprocessing approaches for turning raw data into information ready to feed the mapping and update pipeline, as well as semantic segmentation and localization, are also briefly reviewed. Moreover, map taxonomy, ontology, and quality assessment are extensively discussed, the general representation methods for map data are presented, and mapping algorithms ranging from SLAM to transformer-based learning approaches are also discussed. The development of HD map update algorithms, from change detection to update methods, is also presented. Finally, the authors discuss possible future developments and the remaining challenges in HD map mapping and update technology. This paper simultaneously serves as a position paper and a tutorial for those new to the HD map mapping and update domain.
https://arxiv.org/abs/2409.09726
Competency question (CQ) formulation is central to several ontology development and evaluation methodologies. Traditionally, the task of crafting these competency questions heavily relies on the effort of domain experts and knowledge engineers, which is often time-consuming and labor-intensive. With the emergence of Large Language Models (LLMs), there arises the possibility to automate and enhance this process. Unlike other similar works which use existing ontologies or knowledge graphs as input to LLMs, we present a retrieval-augmented generation (RAG) approach that uses LLMs for the automatic generation of CQs given a set of scientific papers considered to be a domain knowledge base. We investigate its performance and, specifically, we study the impact of different numbers of papers provided to the RAG and of different temperature settings of the LLM. We conduct experiments using GPT-4 on two domain ontology engineering tasks and compare results against ground-truth CQs constructed by domain experts. Empirical assessments of the results, using evaluation metrics (precision and consistency), reveal that compared to zero-shot prompting, adding relevant domain knowledge to the RAG improves the performance of LLMs in generating CQs for concrete ontology engineering tasks.
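A rough sketch of such a RAG loop for CQ generation, using TF-IDF retrieval as a stand-in retriever; `call_llm` is a placeholder for any chat-completion client, not a specific API, and the prompt wording is illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def retrieve(papers, query, k=3):
    """Return the k papers most relevant to the query (TF-IDF stand-in retriever)."""
    vec = TfidfVectorizer().fit(papers + [query])
    doc_mat, query_vec = vec.transform(papers), vec.transform([query])
    scores = cosine_similarity(query_vec, doc_mat)[0]
    return [papers[i] for i in scores.argsort()[::-1][:k]]

def generate_cqs(papers, domain, call_llm, k=3, temperature=0.2):
    """Assemble a RAG prompt from retrieved papers and ask the LLM for competency questions."""
    context = "\n\n".join(retrieve(papers, f"key concepts of {domain}", k))
    prompt = (
        f"You are helping engineer an ontology for the domain: {domain}.\n"
        f"Based on the following excerpts from domain papers:\n{context}\n\n"
        "List competency questions the ontology should be able to answer."
    )
    return call_llm(prompt, temperature=temperature)
```

Varying `k` (how many papers reach the prompt) and `temperature` corresponds to the two factors the paper studies.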
https://arxiv.org/abs/2409.08820
Knowledge Graph-to-Text (G2T) generation involves verbalizing structured knowledge graphs into natural language text. Recent advancements in Pretrained Language Models (PLMs) have improved G2T performance, but their effectiveness depends on datasets with precise graph-text alignment. However, the scarcity of high-quality, general-domain G2T generation datasets restricts progress in general-domain G2T generation research. To address this issue, we introduce the Wikipedia Ontology-Free Graph-text dataset (WikiOFGraph), a new large-scale G2T dataset generated using a novel method that leverages a Large Language Model (LLM) and Data-QuestEval. Our new dataset, which contains 5.85M general-domain graph-text pairs, offers high graph-text consistency without relying on external ontologies. Experimental results demonstrate that PLMs fine-tuned on WikiOFGraph outperform those trained on other datasets across various evaluation metrics. Our method proves to be a scalable and effective solution for generating high-quality G2T data, significantly advancing the field of G2T generation.
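The consistency-filtering step can be sketched as follows; the scoring function here is a naive stand-in for a metric such as Data-QuestEval, and the threshold is illustrative.

```python
def filter_graph_text_pairs(pairs, score_consistency, threshold=0.8):
    """Keep only (graph_triples, text) pairs whose graph-text consistency score
    clears the threshold. `score_consistency` stands in for a metric such as
    Data-QuestEval; the threshold value is illustrative, not the paper's."""
    return [(g, t) for g, t in pairs if score_consistency(g, t) >= threshold]

# Toy example: triples are (subject, predicate, object) tuples.
pairs = [
    ([("Ada Lovelace", "occupation", "mathematician")],
     "Ada Lovelace was a mathematician."),
    ([("Ada Lovelace", "occupation", "mathematician")],
     "The weather was pleasant that day."),
]

# A deliberately naive scorer: fraction of triple subjects/objects mentioned in the text.
def naive_score(triples, text):
    terms = [x for t in triples for x in (t[0], t[2])]
    return sum(term.lower() in text.lower() for term in terms) / len(terms)

print(len(filter_graph_text_pairs(pairs, naive_score)))  # 1: the mismatched pair is dropped
```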
https://arxiv.org/abs/2409.07088
Vehicles in public traffic that are equipped with Automated Driving Systems are subject to a number of expectations: Among other aspects, their behavior should be safe, conforming to the rules of the road and provide mobility to their users. This poses challenges for the developers of such systems: Developers are responsible for specifying this behavior, for example, in terms of requirements at system design time. As we will discuss in the article, this specification always involves the need for assumptions and trade-offs. As a result, insufficiencies in such a behavior specification can occur that can potentially lead to unsafe system behavior. In order to support the identification of specification insufficiencies, requirements and respective assumptions need to be made explicit. In this article, we propose the Semantic Norm Behavior Analysis as an ontology-based approach to specify the behavior for an Automated Driving System equipped vehicle. We use ontologies to formally represent specified behavior for a targeted operational environment, and to establish traceability between specified behavior and the addressed stakeholder needs. Furthermore, we illustrate the application of the Semantic Norm Behavior Analysis in two example scenarios and evaluate our results.
https://arxiv.org/abs/2409.06607
This paper presents an innovative data-centric paradigm for designing computational systems by introducing a new informatics domain model. The proposed model moves away from the conventional node-centric framework and focuses on data-centric categorization, using a multimodal approach that incorporates objects, events, concepts, and actions. By drawing on interdisciplinary research and establishing a foundational ontology based on these core elements, the model promotes semantic consistency and secure data handling across distributed ecosystems. We also explore the implementation of this model as an OWL 2 ontology, discuss its potential applications, and outline its scalability and future directions for research. This work aims to serve as a foundational guide for system designers and data architects in developing more secure, interoperable, and scalable data systems.
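As a purely illustrative sketch (not the authors' ontology), the four core elements could be declared as OWL 2 classes with rdflib as shown below; the namespace and property names are made up.

```python
from rdflib import Graph, Literal, Namespace, OWL, RDF, RDFS

# Hypothetical namespace and class names; the paper's actual OWL 2 ontology may differ.
EX = Namespace("http://example.org/informatics#")
g = Graph()
g.bind("ex", EX)

for core in ("Object", "Event", "Concept", "Action"):
    g.add((EX[core], RDF.type, OWL.Class))
    g.add((EX[core], RDFS.label, Literal(core)))

# A sample relation between core elements: an Action involves an Object.
g.add((EX.involvesObject, RDF.type, OWL.ObjectProperty))
g.add((EX.involvesObject, RDFS.domain, EX.Action))
g.add((EX.involvesObject, RDFS.range, EX.Object))

print(g.serialize(format="turtle"))
```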
https://arxiv.org/abs/2409.09058
This paper presents an ontology design along with knowledge engineering, and multilingual semantic reasoning techniques to build an automated system for assimilating culinary information for Indian food in the form of a knowledge graph. The main focus is on designing intelligent methods to derive ontology designs and capture all-encompassing knowledge about food, recipes, ingredients, cooking characteristics, and most importantly, nutrition, at scale. We present our ongoing work in this workshop paper, describe in some detail the relevant challenges in curating knowledge of Indian food, and propose our high-level ontology design. We also present a novel workflow that uses AI, LLM, and language technology to curate information from recipe blog sites in the public domain to build knowledge graphs for Indian food. The methods for knowledge curation proposed in this paper are generic and can be replicated for any domain. The design is application-agnostic and can be used for AI-driven smart analysis, building recommendation systems for Personalized Digital Health, and complementing the knowledge graph for Indian food with contextual information such as user information, food biochemistry, geographic information, agricultural information, etc.
https://arxiv.org/abs/2409.00830
Detecting individuals' depression on social media has become increasingly important. Researchers have employed ML/DL or lexicon-based methods for automated depression detection. Lexicon-based methods, explainable and easy to implement, match words from user posts against a depression dictionary without considering context. While DL models can leverage contextual information, their black-box nature limits their adoption in the domain. Though surrogate models like LIME and SHAP can produce explanations for DL models, the explanations are suitable for the developer and of limited use to the end user. We propose a Knowledge-infused Neural Network (KiNN) that incorporates domain-specific knowledge from the DepressionFeature ontology (DFO) into a neural network to endow the model with user-level explainability regarding concepts and processes the clinician understands. Further, commonsense knowledge from the Commonsense Transformer (COMET) trained on ATOMIC is also infused to account for the generic emotional aspects of user posts in depression detection. The model is evaluated on three expertly curated datasets related to depression. We observed the model to have a statistically significant (p<0.1) boost in performance over the best domain-specific model, MentalBERT, on CLEF e-Risk (25% MCC increase, 12% F1 increase). A similar trend is observed on the PRIMATE dataset, where the proposed model performed better than MentalBERT (2.5% MCC increase, 19% F1 increase). The observations confirm that the generated explanations are informative for mental health professionals (MHPs) compared to post hoc model explanations. Results also demonstrate that the user-level explainability of KiNN surpasses that of baseline models, providing explanations where other baselines fall short. Infusing domain and commonsense knowledge in KiNN enhances the ability of models like GPT-3.5 to generate application-relevant explanations.
https://arxiv.org/abs/2409.02122
Developing novel predictive models with complex biomedical information is challenging due to various idiosyncrasies related to heterogeneity, standardization or sparseness of the data. We previously introduced a person-centric ontology to organize information about individual patients, and a representation learning framework to extract person-centric knowledge graphs (PKGs) and to train Graph Neural Networks (GNNs). In this paper, we propose a systematic approach to examine the results of GNN models trained with both structured and unstructured information from the MIMIC-III dataset. Through ablation studies on different clinical, demographic, and social data, we show the robustness of this approach in identifying predictive features in PKGs for the task of readmission prediction.
https://arxiv.org/abs/2408.15294
We introduce semantic towers, an extrinsic knowledge representation method, and compare it to intrinsic knowledge in large language models for ontology learning. Our experiments show a trade-off between performance and semantic grounding for extrinsic knowledge compared to a fine-tuned model's intrinsic knowledge. We report our findings on the Large Language Models for Ontology Learning (LLMs4OL) 2024 challenge.
https://arxiv.org/abs/2408.14236
This paper presents CodeRefine, a novel framework for automatically transforming research paper methodologies into functional code using Large Language Models (LLMs). Our multi-step approach first extracts and summarizes key text chunks from papers, analyzes their code relevance, and creates a knowledge graph using a predefined ontology. Code is then generated from this structured representation and enhanced through a proposed retrospective retrieval-augmented generation approach. CodeRefine addresses the challenge of bridging theoretical research and practical implementation, offering a more accurate alternative to LLM zero-shot prompting. Evaluations on diverse scientific papers demonstrate CodeRefine's ability to improve code implementation from the paper, potentially accelerating the adoption of cutting-edge algorithms in real-world applications.
https://arxiv.org/abs/2408.13366
Explainable AI (XAI) can greatly enhance user trust and satisfaction in AI-assisted decision-making processes. Recent findings suggest that a single explainer may not meet the diverse needs of multiple users in an AI system; indeed, even individual users may require multiple explanations. This highlights the necessity for a "multi-shot" approach, employing a combination of explainers to form what we introduce as an "explanation strategy". Tailored to a specific user or a user group, an "explanation experience" describes interactions with personalised strategies designed to enhance their AI decision-making processes. The iSee platform is designed for the intelligent sharing and reuse of explanation experiences, using Case-based Reasoning to advance best practices in XAI. The platform provides tools that enable AI system designers, i.e. design users, to design and iteratively revise the most suitable explanation strategy for their AI system to satisfy end-user needs. All knowledge generated within the iSee platform is formalised by the iSee ontology for interoperability. We use a summative mixed methods study protocol to evaluate the usability and utility of the iSee platform with six design users across varying levels of AI and XAI expertise. Our findings confirm that the iSee platform generalises effectively across applications and has the potential to promote the adoption of XAI best practices.
https://arxiv.org/abs/2408.12941
Transcriptome foundation models (TFMs) hold great promise for deciphering the transcriptomic language that dictates diverse cell functions via self-supervised learning on large-scale single-cell gene expression data, and ultimately for unraveling the complex mechanisms of human diseases. However, current TFMs treat cells as independent samples and ignore the taxonomic relationships between cell types, which are available in cell ontology graphs. We argue that effectively leveraging this ontology information during TFM pre-training can improve the learning of biologically meaningful gene co-expression patterns while preserving the TFM as a general-purpose foundation model for downstream zero-shot and fine-tuning tasks. To this end, we present the single-cell, Cell-ontology guided TFM scCello. We introduce a cell-type coherence loss and an ontology alignment loss, which are minimized along with the masked gene expression prediction loss during pre-training. These novel loss components guide scCello to learn cell-type-specific representations and the structural relations between cell types from the cell ontology graph, respectively. We pre-trained scCello on 22 million cells from the CellxGene database, leveraging their cell-type labels mapped to the cell ontology graph from the Open Biological and Biomedical Ontology Foundry. Our TFM demonstrates competitive generalization and transferability performance over existing TFMs on biologically important tasks, including identifying novel cell types of unseen cells, predicting cell-type-specific marker genes, and predicting cancer drug responses.
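To make the three-part objective concrete, here is a rough PyTorch sketch of how such losses might be combined during pre-training; the specific loss forms, weights, and tensor shapes are assumptions rather than the paper's definitions.

```python
import torch
import torch.nn.functional as F

def pretraining_loss(masked_logits, masked_targets,
                     cell_emb, cell_type_emb, ontology_dist,
                     w_coherence=1.0, w_align=1.0):
    """Illustrative combination of the three objectives named in the abstract.
    Loss forms and weights are assumptions, not the paper's definitions.

    masked_logits / masked_targets: masked gene expression prediction task.
    cell_emb: (N, d) cell embeddings; cell_type_emb: (N, d) embeddings of each
    cell's annotated type; ontology_dist: (N, N) distances between the cells'
    types in the cell ontology graph.
    """
    # 1) Masked gene expression prediction (cross-entropy over binned expression).
    mlm_loss = F.cross_entropy(masked_logits, masked_targets)

    # 2) Cell-type coherence: pull a cell's embedding toward its cell-type embedding.
    coherence_loss = 1.0 - F.cosine_similarity(cell_emb, cell_type_emb, dim=-1).mean()

    # 3) Ontology alignment: embedding distances should track ontology distances.
    emb_dist = torch.cdist(cell_emb, cell_emb)
    align_loss = F.mse_loss(emb_dist, ontology_dist)

    return mlm_loss + w_coherence * coherence_loss + w_align * align_loss
```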
https://arxiv.org/abs/2408.12373
Anomaly detection is a fundamental yet challenging problem with practical applications in industry. Current approaches neglect the higher-order dependencies within the networks of interconnected sensors in high-dimensional time series (multi-sensor data) for anomaly detection. To this end, we present a self-adapting anomaly detection framework for jointly learning (a) a discrete hypergraph structure and (b) a model of the temporal trends and spatial relations among the interdependent sensors, using a hierarchical encoder-decoder architecture to overcome these challenges. The hypergraph representation learning-based framework exploits the relational inductive biases in the hypergraph-structured data to learn pointwise single-step-ahead forecasts through a self-supervised autoregressive task and predicts anomalies based on the forecast error. Furthermore, our framework incentivizes learning the anomaly-diagnosis ontology through a differentiable approach. It derives anomaly-information-propagation-based computational hypergraphs for root cause analysis and provides recommendations through an offline, optimal predictive control policy to remedy an anomaly. We conduct extensive experiments to evaluate the proposed method on benchmark datasets for a fair and rigorous comparison with popular baselines. The proposed method outperforms the baseline models and achieves SOTA performance. We report ablation studies to support the efficacy of the framework.
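The forecast-error-based detection step can be illustrated with a small NumPy sketch; the 3-sigma threshold and toy data below are illustrative heuristics, not the paper's exact criterion.

```python
import numpy as np

def anomaly_scores(observed, forecast):
    """Per-timestep anomaly score: forecast error aggregated across sensors."""
    return np.abs(observed - forecast).mean(axis=1)

def flag_anomalies(observed, forecast, n_sigma=3.0):
    """Flag timesteps whose forecast error is far above the typical error.
    The n-sigma rule is a common heuristic, not the paper's exact criterion."""
    scores = anomaly_scores(observed, forecast)
    threshold = scores.mean() + n_sigma * scores.std()
    return scores > threshold

# Toy multi-sensor series: 100 timesteps x 5 sensors, with one injected anomaly.
rng = np.random.default_rng(0)
observed = rng.normal(size=(100, 5))
forecast = observed + rng.normal(scale=0.1, size=(100, 5))  # near-perfect forecasts
observed[60] += 5.0                                          # anomalous timestep
print(np.where(flag_anomalies(observed, forecast))[0])       # -> [60]
```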
https://arxiv.org/abs/2408.11359
To address the challenge of automating knowledge discovery from a vast volume of literature, in this paper, we introduce a novel framework based on large language models (LLMs) that combines a progressive ontology prompting (POP) algorithm with a dual-agent system, named LLM-Duo, designed to enhance the automation of knowledge extraction from scientific articles. The POP algorithm utilizes a prioritized breadth-first search (BFS) across a predefined ontology to generate structured prompt templates and action orders, thereby guiding LLMs to discover knowledge in an automatic manner. Additionally, our LLM-Duo employs two specialized LLM agents: an explorer and an evaluator. These two agents work collaboratively and adversarially to enhance the reliability of the discovery and annotation processes. Experiments demonstrate that our method outperforms advanced baselines, enabling more accurate and complete annotations. To validate the effectiveness of our method in real-world scenarios, we employ our method in a case study of speech-language intervention discovery. Our method identifies 2,421 interventions from 64,177 research articles in the speech-language therapy domain. We curate these findings into a publicly accessible intervention knowledge base that holds significant potential to benefit the speech-language therapy community.
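A minimal sketch of a prioritized breadth-first traversal over a predefined ontology that emits one extraction prompt per concept; the toy ontology, priority scores, and prompt wording are placeholders, not the paper's actual POP templates.

```python
import heapq

def progressive_ontology_prompts(ontology, priority, root, article_text):
    """Traverse the ontology breadth-first, visiting higher-priority siblings first,
    and yield one structured extraction prompt per concept. The ontology, priority
    scores, and prompt template below are illustrative placeholders."""
    # Heap entries: (depth, -priority, concept) so shallower, higher-priority nodes come first.
    heap = [(0, -priority.get(root, 0), root)]
    seen = {root}
    while heap:
        depth, _, concept = heapq.heappop(heap)
        yield (
            f"From the article below, extract every mention of '{concept}' "
            f"and its attributes, as JSON.\n\nArticle:\n{article_text}"
        )
        for child in ontology.get(concept, []):
            if child not in seen:
                seen.add(child)
                heapq.heappush(heap, (depth + 1, -priority.get(child, 0), child))

# Toy ontology: concept -> sub-concepts, with per-concept priorities.
ontology = {"Intervention": ["Dosage", "Outcome"], "Outcome": ["Effect size"]}
priority = {"Intervention": 2, "Outcome": 1, "Dosage": 0, "Effect size": 0}
for prompt in progressive_ontology_prompts(ontology, priority, "Intervention", "..."):
    print(prompt.splitlines()[0])
```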
https://arxiv.org/abs/2409.00054
The cutting edge of applying AI to science is the closed-loop automation of scientific research: robot scientists. We have previously developed two robot scientists: `Adam' (for yeast functional biology) and `Eve' (for early-stage drug design). We are now developing a next-generation robot scientist, Genesis. With Genesis we aim to demonstrate that an area of science can be investigated using robot scientists unambiguously faster, and at lower cost, than with human scientists. Here we report progress on the Genesis project. Genesis is designed to automatically improve systems biology models with thousands of interacting causal components. When complete, Genesis will be able to initiate and execute in parallel one thousand hypothesis-led closed-loop cycles of experiment per day. Here we describe the core Genesis hardware: the one thousand computer-controlled µ-bioreactors. For the integrated mass spectrometry platform we have developed AutonoMS, a system to automatically run, process, and analyse high-throughput experiments. We have also developed Genesis-DB, a database system designed to enable software agents to access large quantities of structured domain information. We have developed RIMBO (Revisions for Improvements of Models in Biology Ontology) to describe the planned hundreds of thousands of changes to the models. We have demonstrated the utility of this infrastructure by developing two relational learning bioinformatics projects. Finally, we describe LGEM+, a relational learning system for the automated abductive improvement of genome-scale metabolic models.
https://arxiv.org/abs/2408.10689
The NFDI4DataScience (NFDI4DS) project aims to enhance the accessibility and interoperability of research data within Data Science (DS) and Artificial Intelligence (AI) by connecting digital artifacts and ensuring they adhere to FAIR (Findable, Accessible, Interoperable, and Reusable) principles. To this end, this poster introduces the NFDI4DS Ontology, which describes resources in DS and AI and models the structure of the NFDI4DS consortium. Built upon the NFDICore ontology and mapped to the Basic Formal Ontology (BFO), this ontology serves as the foundation for the NFDI4DS knowledge graph currently under development.
https://arxiv.org/abs/2408.08698