Flexibility remains a significant challenge in AI-based residential layout design: traditional methods such as rule-based heuristics and graph-based generation are often rigid and require substantial design knowledge from users. To address these limitations, we propose a cross-modal design approach based on the Stable Diffusion model for generating flexible residential layouts. The method offers multiple input types for learning objectives, allowing users to specify both boundaries and layouts. It incorporates natural language as design constraints and introduces ControlNet to enable stable layout generation through two distinct pathways. We also present a scheme that encapsulates design expertise within a knowledge graph and translates it into natural language, providing an interpretable representation of design knowledge. This comprehensibility and diversity of input options enable professionals and non-professionals to directly express design requirements, enhancing flexibility and controllability. Finally, experiments show that the proposed method is more flexible under multimodal constraints than state-of-the-art models, even when specific semantic information about room areas or connections is incomplete.
https://arxiv.org/abs/2501.09279
Identifying reliable synthesis pathways in materials chemistry is a complex task, particularly in polymer science, due to the intricate and often non-unique nomenclature of macromolecules. To address this challenge, we propose an agent system that integrates large language models (LLMs) and knowledge graphs (KGs). By leveraging LLMs' powerful capabilities for extracting and recognizing chemical substance names, and storing the extracted data in a structured knowledge graph, our system fully automates the retrieval of relevant literature, the extraction of reaction data, database querying, the construction of retrosynthetic pathway trees, their further expansion through the retrieval of additional literature, and the recommendation of optimal reaction pathways. A novel Multi-branched Reaction Pathway Search (MBRPS) algorithm enables the exploration of all pathways, with a particular focus on multi-branched ones, helping LLMs overcome weak reasoning in multi-branched paths. This work represents the first attempt to develop a fully automated, LLM-powered retrosynthesis planning agent tailored specifically to macromolecules. Applied to polyimide synthesis, our approach constructs a retrosynthetic pathway tree with hundreds of pathways and recommends optimized routes, including both known and novel pathways, demonstrating its effectiveness and potential for broader applications.
https://arxiv.org/abs/2501.08897
Traditional similarity-based schema matching methods are incapable of resolving semantic ambiguities and conflicts in domain-specific complex mapping scenarios due to missing commonsense and domain-specific knowledge. The hallucination problem of large language models (LLMs) also makes it challenging for LLM-based schema matching to address the above issues. Therefore, we propose a Knowledge Graph-based Retrieval-Augmented Generation model for Schema Matching, referred to as the KG-RAG4SM. In particular, KG-RAG4SM introduces novel vector-based, graph traversal-based, and query-based graph retrievals, as well as a hybrid approach and ranking schemes that identify the most relevant subgraphs from external large knowledge graphs (KGs). We showcase that KG-based retrieval-augmented LLMs are capable of generating more accurate results for complex matching cases without any re-training. Our experimental results show that KG-RAG4SM outperforms the LLM-based state-of-the-art (SOTA) methods (e.g., Jellyfish-8B) by 35.89% and 30.50% in terms of precision and F1 score on the MIMIC dataset, respectively; KG-RAG4SM with GPT-4o-mini outperforms the pre-trained language model (PLM)-based SOTA methods (e.g., SMAT) by 69.20% and 21.97% in terms of precision and F1 score on the Synthea dataset, respectively. The results also demonstrate that our approach is more efficient in end-to-end schema matching, and scales to retrieve from large KGs. Our case studies on the dataset from the real-world schema matching scenario exhibit that the hallucination problem of LLMs for schema matching is well mitigated by our solution.
https://arxiv.org/abs/2501.08686
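The KG-RAG4SM abstract above mentions vector-based retrieval of relevant subgraphs from an external KG. The abstract does not describe the scoring, so the following is only a minimal sketch of the general idea, assuming triples are ranked by cosine similarity to a query embedding. The toy triples, the 3-dimensional vectors, and the function names `cosine` and `top_k_triples` are hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_triples(query_vec, triple_vecs, k=2):
    """Return the k triples whose embeddings are most similar to the query."""
    ranked = sorted(
        triple_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [triple for triple, _ in ranked[:k]]


# Hypothetical toy embeddings; a real system would use a learned encoder.
triples = {
    ("admission", "has_attribute", "admit_time"): [0.9, 0.1, 0.0],
    ("patient", "has_attribute", "birth_date"): [0.1, 0.9, 0.0],
    ("visit", "same_as", "admission"): [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # stands in for an embedded schema attribute

for triple in top_k_triples(query, triples, k=2):
    print(triple)
```

The retrieved triples would then be placed in the LLM prompt as context; the graph traversal-based and query-based retrievals named in the abstract are not covered by this sketch.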
The pursuit of automated scientific discovery has fueled progress from symbolic logic to modern AI, forging new frontiers in reasoning and pattern recognition. Transformers function as potential systems, where every possible relationship remains latent potentiality until tasks impose constraints, akin to measurement. Yet, refining their sampling requires more than probabilistic selection: solutions must conform to specific structures or rules, ensuring consistency and the invocation of general principles. We present Graph-PReFLexOR (Graph-based Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning), a framework that combines graph reasoning with symbolic abstraction to dynamically expand domain knowledge. Inspired by reinforcement learning, Graph-PReFLexOR defines reasoning as a structured mapping, where tasks yield knowledge graphs, abstract patterns, and ultimately, final answers. Inspired by category theory, it encodes concepts as nodes and their relationships as edges, supporting hierarchical inference and adaptive learning through isomorphic representations. Demonstrations include hypothesis generation, materials design, and creative reasoning, such as discovering relationships between mythological concepts like 'thin places' with materials science. We propose a 'knowledge garden growth' strategy that integrates insights across domains, promoting interdisciplinary connections. Results with a 3-billion-parameter Graph-PReFLexOR model show superior reasoning depth and adaptability, underscoring the potential for transparent, multidisciplinary AI-driven discovery. It lays the groundwork for general autonomous reasoning solutions.
https://arxiv.org/abs/2501.08120
Large Language Models (LLMs) have attracted considerable attention in various fields due to their superior performance. They are deep learning models from the field of Natural Language Processing (NLP) that train hundreds of millions of parameters or more on large amounts of text data to understand and generate natural language, for example by predicting the next word or generating content related to a given text. As their superior performance has become apparent, LLMs are increasingly being applied to knowledge graph embedding (KGE) related tasks to improve the processing results. They have recently been invoked to varying degrees in different types of KGE scenarios, such as multi-modal KGE and open KGE, according to the characteristics of each task. In this paper, we investigate a wide range of approaches for performing LLM-related tasks in different types of KGE scenarios. To better compare the various approaches, we organize them in a classification by KGE scenario. In addition to the categorization, we provide a tabular overview of the methods and their source code links for a more direct comparison. We also discuss the applications in which the methods are mainly used and suggest several forward-looking directions for the development of this new research area.
https://arxiv.org/abs/2501.07766
In the current development of large language models (LLMs), it is important to ensure the accuracy and reliability of the underlying data sources. LLMs are critical for various applications, but they often suffer from hallucinations and inaccuracies due to knowledge gaps in the training data. Knowledge graphs (KGs), as a powerful structural tool, could serve as a vital external information source to mitigate these issues. By providing a structured and comprehensive understanding of real-world data, KGs enhance the performance and reliability of LLMs. However, errors are commonly introduced when triplets are extracted from unstructured data to construct KGs. This can degrade performance in downstream tasks such as question answering and recommender systems. Therefore, anomaly detection in KGs is essential to identify and correct these errors. This paper presents an anomaly detection algorithm in knowledge graphs with dual-channel learning (ADKGD). ADKGD leverages a dual-channel learning approach to enhance representation learning from both the entity-view and triplet-view perspectives. Furthermore, using a cross-layer approach, our framework integrates internal information aggregation and context information aggregation. We introduce a Kullback-Leibler (KL) loss component to improve the accuracy of the scoring function between the dual channels. To evaluate ADKGD's performance, we conduct empirical studies on three real-world KGs: WN18RR, FB15K, and NELL-995. Experimental results demonstrate that ADKGD outperforms state-of-the-art anomaly detection algorithms. The source code and datasets are publicly available at this https URL.
https://arxiv.org/abs/2501.07078
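The ADKGD abstract above introduces a KL loss between its two channels but does not say how the term is formed. As a hedged illustration only, the sketch below computes the textbook Kullback-Leibler divergence between two softmax-normalized score vectors, one per channel. The score values and the names `softmax` and `kl_divergence` are assumptions made for this example, not the authors' implementation.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i); eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


# Hypothetical per-triplet scores from the entity-view and triplet-view channels.
entity_view = softmax([2.0, 0.5, -1.0])
triplet_view = softmax([1.5, 0.7, -0.5])

print(kl_divergence(entity_view, triplet_view))
```

Minimizing such a term pushes the two channels toward agreeing on which triplets look anomalous; note that KL divergence is asymmetric, so the direction of the comparison is a design choice.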
We analyze over 44,000 NBER and CEPR working papers from 1980 to 2023 using a custom language model to construct knowledge graphs that map economic concepts and their relationships. We distinguish between general claims and those documented via causal inference methods (e.g., DiD, IV, RDD, RCTs). We document a substantial rise in the share of causal claims, from roughly 4% in 1990 to nearly 28% in 2020, reflecting the growing influence of the "credibility revolution." We find that causal narrative complexity (e.g., the depth of causal chains) strongly predicts both publication in top-5 journals and higher citation counts, whereas non-causal complexity tends to be uncorrelated or negatively associated with these outcomes. Novelty is also pivotal for top-5 publication, but only when grounded in credible causal methods: introducing genuinely new causal edges or paths markedly increases both the likelihood of acceptance at leading outlets and long-run citations, while non-causal novelty exhibits weak or even negative effects. Papers engaging with central, widely recognized concepts tend to attract more citations, highlighting a divergence between factors driving publication success and long-term academic impact. Finally, bridging underexplored concept pairs is rewarded primarily when grounded in causal methods, yet such gap filling exhibits no consistent link with future citations. Overall, our findings suggest that methodological rigor and causal innovation are key drivers of academic recognition, but sustained impact may require balancing novel contributions with conceptual integration into established economic discourse.
https://arxiv.org/abs/2501.06873
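The working-paper study above uses "the depth of causal chains" as one measure of causal narrative complexity. The abstract does not define the measure precisely, so the sketch below assumes one plausible reading: the number of edges on the longest directed path in a paper's causal claim graph, which must be acyclic. The example edges and the function name `causal_depth` are hypothetical.

```python
from functools import lru_cache


def causal_depth(edges):
    """Length (in edges) of the longest directed path; assumes no cycles."""
    graph = {}
    for cause, effect in edges:
        graph.setdefault(cause, []).append(effect)
        graph.setdefault(effect, [])

    @lru_cache(maxsize=None)
    def longest_from(node):
        # A node with no outgoing edges ends the chain at depth 0.
        return max((1 + longest_from(nxt) for nxt in graph[node]), default=0)

    return max((longest_from(node) for node in graph), default=0)


# Hypothetical causal claims extracted from a single paper.
claims = [
    ("minimum wage", "employment"),
    ("employment", "consumption"),
    ("minimum wage", "prices"),
]
print(causal_depth(claims))
```

Here the chain minimum wage, employment, consumption has two edges, while the prices branch has only one, so the longest path determines the depth.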
This paper explores the integration of provenance tracking systems within the context of Semantic Web technologies to enhance data integrity in diverse operational environments. SURROUND Australia Pty Ltd demonstrates innovative applications of the PROV Data Model (PROV-DM) and its Semantic Web variant, PROV-O, to systematically record and manage provenance information across multiple data processing domains. By employing RDF and Knowledge Graphs, SURROUND addresses the critical challenges of shared entity identification and provenance granularity. The paper highlights the company's architecture for capturing comprehensive provenance data, enabling robust validation, traceability, and knowledge inference. Through the examination of two projects, we illustrate how provenance mechanisms not only improve data reliability but also facilitate seamless integration across heterogeneous systems. Our findings underscore the importance of sophisticated provenance solutions in maintaining data integrity, serving as a reference for industry peers and academics engaged in provenance research and implementation.
https://arxiv.org/abs/2501.09029
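The provenance paper above records provenance with PROV-O over RDF. As a rough illustration of what such a record can look like, the sketch below builds a handful of triples for one processing step using standard PROV-O terms (`prov:Entity`, `prov:Activity`, `prov:Agent`, `prov:wasGeneratedBy`, `prov:used`, `prov:wasDerivedFrom`, `prov:wasAssociatedWith`) and prints them as N-Triples. The `http://example.org/` namespace, the resource names, and the helper functions are hypothetical; this is not SURROUND's architecture.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for illustration


def provenance_triples(output, activity, inputs, agent):
    """Describe one processing step: which activity produced what, from what."""
    triples = [
        (EX + output, RDF_TYPE, PROV + "Entity"),
        (EX + activity, RDF_TYPE, PROV + "Activity"),
        (EX + agent, RDF_TYPE, PROV + "Agent"),
        (EX + output, PROV + "wasGeneratedBy", EX + activity),
        (EX + activity, PROV + "wasAssociatedWith", EX + agent),
    ]
    for source in inputs:
        triples.append((EX + source, RDF_TYPE, PROV + "Entity"))
        triples.append((EX + activity, PROV + "used", EX + source))
        triples.append((EX + output, PROV + "wasDerivedFrom", EX + source))
    return triples


def to_ntriples(triples):
    """Serialize IRI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


record = provenance_triples(
    output="cleaned_dataset",
    activity="cleaning_run",
    inputs=["raw_dataset"],
    agent="etl_pipeline",
)
print(to_ntriples(record))
```

Giving every entity a stable IRI is what allows separate systems to refer to the same thing, which is the shared entity identification challenge the abstract names.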
Much has been discussed about how Large Language Models, Knowledge Graphs and Search Engines can be combined in a synergistic manner. A dimension largely absent from current academic discourse is the user perspective. In particular, there remain many open questions regarding how best to address the diverse information needs of users, incorporating varying facets and levels of difficulty. This paper introduces a taxonomy of user information needs, which guides us to study the pros, cons and possible synergies of Large Language Models, Knowledge Graphs and Search Engines. From this study, we derive a roadmap for future research.
https://arxiv.org/abs/2501.06699
This paper introduces a neuro-symbolic approach for relational exploration in cultural heritage knowledge graphs, leveraging Large Language Models (LLMs) for explanation generation and a novel mathematical framework to quantify the interestingness of relationships. We demonstrate the importance of interestingness measure using a quantitative analysis, by highlighting its impact on the overall performance of our proposed system, particularly in terms of precision, recall, and F1-score. Using the Wikidata Cultural Heritage Linked Open Data (WCH-LOD) dataset, our approach yields a precision of 0.70, recall of 0.68, and an F1-score of 0.69, representing an improvement compared to graph-based (precision: 0.28, recall: 0.25, F1-score: 0.26) and knowledge-based baselines (precision: 0.45, recall: 0.42, F1-score: 0.43). Furthermore, our LLM-powered explanations exhibit better quality, reflected in BLEU (0.52), ROUGE-L (0.58), and METEOR (0.63) scores, all higher than the baseline approaches. We show a strong correlation (0.65) between interestingness measure and the quality of generated explanations, validating its effectiveness. The findings highlight the importance of LLMs and a mathematical formalization for interestingness in enhancing the effectiveness of relational exploration in cultural heritage knowledge graphs, with results that are measurable and testable. We further show that the system enables more effective exploration compared to purely knowledge-based and graph-based methods.
https://arxiv.org/abs/2501.06628
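The cultural heritage abstract above reports precision, recall, and F1 for the proposed approach and two baselines. Since F1 is the harmonic mean of precision and recall, the reported triples can be checked for internal consistency. The short sketch below does this using only the numbers quoted in the abstract; the dictionary layout and the function name `f1_score` are choices made for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# (precision, recall, reported F1) as quoted in the abstract.
reported = {
    "proposed": (0.70, 0.68, 0.69),
    "graph-based baseline": (0.28, 0.25, 0.26),
    "knowledge-based baseline": (0.45, 0.42, 0.43),
}

for name, (p, r, f1) in reported.items():
    computed = round(f1_score(p, r), 2)
    print(f"{name}: computed F1 = {computed}, reported F1 = {f1}")
```

For all three systems the recomputed value matches the reported F1 after rounding to two decimals.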
We introduce the world's first clinical terminology for the Chinese healthcare community, namely MedCT, accompanied by a clinical foundation model MedBERT and an entity linking model MedLink. The MedCT system enables standardized and programmable representation of Chinese clinical data, in turn stimulating the development of new medicines, treatment pathways, and better patient outcomes for the populous Chinese community. Moreover, the MedCT knowledge graph provides a principled mechanism to minimize the hallucination problem of large language models (LLMs), thereby achieving significant levels of accuracy and safety in LLM-based clinical applications. By leveraging the LLMs' emergent capabilities of generativeness and expressiveness, we were able to rapidly build a production-quality terminology system and deploy it to real-world clinical settings within three months, whereas classical terminologies like SNOMED CT have gone through more than twenty years of development. Our experiments show that the MedCT system achieves state-of-the-art (SOTA) performance in semantic matching and entity linking tasks, not only for Chinese but also for English. We also conducted a longitudinal field experiment by applying MedCT and LLMs in a representative spectrum of clinical tasks, including electronic health record (EHR) auto-generation and medical document search for diagnostic decision making. Our study shows that MedCT offers value in many respects for clinical workflows and patient outcomes, especially in the new genre of clinical LLM applications. We present our approach in sufficient engineering detail that a clinical terminology for other non-English-speaking societies should be readily reproducible. We openly release our terminology, models and algorithms, along with real-world clinical datasets for the development.
https://arxiv.org/abs/2501.06465
The growing demand for halal cosmetic products has exposed significant challenges, especially in Muslim-majority countries. Recently, various machine learning-based strategies, e.g., image-based methods, have shown remarkable success in predicting the halal status of cosmetics. However, these methods mainly focus on analyzing the discrete and specific ingredients within separate cosmetics, which ignore the high-order and complex relations between cosmetics and ingredients. To address this problem, we propose a halal cosmetic recommendation framework, namely HaCKG, that leverages a knowledge graph of cosmetics and their ingredients to explicitly model and capture the relationships between cosmetics and their components. By representing cosmetics and ingredients as entities within the knowledge graph, HaCKG effectively learns the high-order and complex relations between entities, offering a robust method for predicting halal status. Specifically, we first construct a cosmetic knowledge graph representing the relations between various cosmetics, ingredients, and their properties. We then propose a pre-trained relational graph attention network model with residual connections to learn the structural relation between entities in the knowledge graph. The pre-trained model is then fine-tuned on downstream cosmetic data to predict halal status. Extensive experiments on the cosmetic dataset over halal prediction tasks demonstrate the superiority of our model over state-of-the-art baselines.
https://arxiv.org/abs/2501.05768
Question answering over temporal knowledge graphs (TKGs) is crucial for understanding evolving facts and relationships, yet its development is hindered by limited datasets and difficulties in generating custom QA pairs. We propose a novel categorization framework based on timeline-context relationships, along with TimelineKGQA, a universal temporal QA generator applicable to any TKG. The code is available at this https URL as an open-source Python package.
https://arxiv.org/abs/2501.04343
Recently, violence detection systems developed using unified multimodal models have achieved significant success and attracted widespread attention. However, most of these systems face two critical challenges: the lack of interpretability as black-box models and limited functionality, offering only classification or retrieval capabilities. To address these challenges, this paper proposes a novel interpretable violence detection system, termed the Three-in-One (TIO) System. The TIO system integrates knowledge graphs (KG) and graph attention networks (GAT) to provide three core functionalities: detection, retrieval, and explanation. Specifically, the system processes each video frame along with text descriptions generated by a large language model (LLM) for videos containing potential violent behavior. It employs ImageBind to generate high-dimensional embeddings for constructing a knowledge graph, uses GAT for reasoning, and applies lightweight time series modules to extract video embedding features. The final step connects a classifier and retriever for multi-functional outputs. The interpretability of KG enables the system to verify the reasoning process behind each output. Additionally, the paper introduces several lightweight methods to reduce the resource consumption of the TIO system and enhance its efficiency. Extensive experiments conducted on the XD-Violence and UCF-Crime datasets validate the effectiveness of the proposed system. A case study further reveals an intriguing phenomenon: as the number of bystanders increases, the occurrence of violent behavior tends to decrease.
https://arxiv.org/abs/2501.06224
Knowledge graphs have proven successful in integrating heterogeneous data across various domains. However, there remains a noticeable dearth of research on their seamless integration among heterogeneous recommender systems, despite knowledge graph-based recommender systems garnering extensive research attention. This study aims to fill this gap by proposing RecKG, a standardized knowledge graph for recommender systems. RecKG ensures the consistent representation of entities across different datasets, accommodating diverse attribute types for effective data integration. Through a meticulous examination of various recommender system datasets, we select attributes for RecKG, ensuring standardized formatting through consistent naming conventions. By these characteristics, RecKG can seamlessly integrate heterogeneous data sources, enabling the discovery of additional semantic information within the integrated knowledge graph. We apply RecKG to standardize real-world datasets, subsequently developing an application for RecKG using a graph database. Finally, we validate RecKG's achievement in interoperability through a qualitative evaluation between RecKG and other studies.
https://arxiv.org/abs/2501.03598
The role of large language models (LLMs) in enterprise modeling has recently started to shift from academic research to that of industrial applications. Thereby, LLMs represent a further building block for the machine-supported generation of enterprise models. In this paper we employ a knowledge graph-based approach for enterprise modeling and investigate the potential benefits of LLMs in this context. In addition, the findings of an expert survey and ChatGPT-4o-based experiments demonstrate that LLM-based model generations exhibit minimal variability, yet remain constrained to specific tasks, with reliability declining for more intricate tasks. The survey results further suggest that the supervision and intervention of human modeling experts are essential to ensure the accuracy and integrity of the generated models.
https://arxiv.org/abs/2501.03566
Multilingual knowledge graphs (KGs) provide high-quality relational and textual information for various NLP applications, but they are often incomplete, especially in non-English languages. Previous research has shown that combining information from KGs in different languages aids either Knowledge Graph Completion (KGC), the task of predicting missing relations between entities, or Knowledge Graph Enhancement (KGE), the task of predicting missing textual information for entities. Although previous efforts have considered KGC and KGE as independent tasks, we hypothesize that they are interdependent and mutually beneficial. To this end, we introduce KG-TRICK, a novel sequence-to-sequence framework that unifies the tasks of textual and relational information completion for multilingual KGs. KG-TRICK demonstrates that: i) it is possible to unify the tasks of KGC and KGE into a single framework, and ii) combining textual information from multiple languages is beneficial to improve the completeness of a KG. As part of our contributions, we also introduce WikiKGE10++, the largest manually-curated benchmark for textual information completion of KGs, which features over 25,000 entities across 10 diverse languages.
https://arxiv.org/abs/2501.03560
Large Language Models (LLMs) have shown impressive performance in various tasks, including knowledge graph completion (KGC). However, current studies mostly apply LLMs to classification tasks, like identifying missing triplets, rather than ranking-based tasks, where the model ranks candidate entities based on plausibility. This focus limits the practical use of LLMs in KGC, as real-world applications prioritize highly plausible triplets. Additionally, while graph paths can help infer the existence of missing triplets and improve completion accuracy, they often contain redundant information. To address these issues, we propose KG-CF, a framework tailored for ranking-based KGC tasks. KG-CF leverages LLMs' reasoning abilities to filter out irrelevant contexts, achieving superior results on real-world datasets. The code and datasets are available at this https URL.
https://arxiv.org/abs/2501.02711
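The KG-CF abstract above contrasts classification with ranking-based KGC, where the model ranks candidate entities by plausibility. The abstract does not name its evaluation metrics; the sketch below assumes two metrics commonly used for ranking-based KGC, mean reciprocal rank (MRR) and Hits@k, computed from the rank at which the correct entity appears for each test query. The rank values are made up for illustration.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries; ranks are 1-based."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose correct entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the true tail entity for three test triplets.
ranks = [1, 2, 4]

print(f"MRR    = {mean_reciprocal_rank(ranks):.3f}")
print(f"Hits@3 = {hits_at_k(ranks, 3):.3f}")
```

Unlike classification accuracy, both metrics reward placing the correct entity near the top of the candidate list, which matches the abstract's point that applications prioritize the most plausible triplets.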
Efficient prediction of default risk for bond-issuing enterprises is pivotal for maintaining stability and fostering growth in the bond market. Conventional methods usually rely solely on an enterprise's internal data for risk assessment. In contrast, graph-based techniques leverage interconnected corporate information to enhance default risk identification for targeted bond issuers. Traditional graph techniques such as the label propagation algorithm or DeepWalk fail to effectively integrate an enterprise's inherent attribute information with its topological network data. Additionally, due to data scarcity and security and privacy concerns between enterprises, end-to-end graph neural network (GNN) algorithms may struggle to deliver satisfactory performance on target tasks. To address these challenges, we present a novel two-stage model. In the first stage, we employ an innovative Masked Autoencoder for Heterogeneous Graphs (HGMAE) to pre-train on a vast enterprise knowledge graph. Subsequently, in the second stage, a specialized classifier model is trained to predict default risk propagation probabilities. The classifier takes as input the feature vectors derived from the pre-trained encoder concatenated with the enterprise's task-specific feature vectors. Through the two-stage training approach, our model not only boosts the importance of unique bond characteristics for specific default prediction tasks, but also securely and efficiently leverages the global information pre-trained from other enterprises. Experimental results demonstrate that our proposed model outperforms existing approaches in predicting default risk for bond issuers.
https://arxiv.org/abs/2501.03268
As large language models (LLMs) evolve, their ability to deliver personalized and context-aware responses offers transformative potential for improving user experiences. Existing personalization approaches, however, often rely solely on user history to augment the prompt, limiting their effectiveness in generating tailored outputs, especially in cold-start scenarios with sparse data. To address these limitations, we propose Personalized Graph-based Retrieval-Augmented Generation (PGraphRAG), a framework that leverages user-centric knowledge graphs to enrich personalization. By directly integrating structured user knowledge into the retrieval process and augmenting prompts with user-relevant context, PGraphRAG enhances contextual understanding and output quality. We also introduce the Personalized Graph-based Benchmark for Text Generation, designed to evaluate personalized text generation tasks in real-world settings where user history is sparse or unavailable. Experimental results show that PGraphRAG significantly outperforms state-of-the-art personalization methods across diverse tasks, demonstrating the unique advantages of graph-based retrieval for personalization.
https://arxiv.org/abs/2501.02157