Abstract
Effective causal inference in high-dimensional datasets must begin with causal discovery, in which a causal graph is generated from observational data. However, obtaining a complete and accurate causal graph is a formidable challenge and is recognized as an NP-hard problem. Recently, Large Language Models (LLMs) have shown emergent capabilities and broad applicability in facilitating causal reasoning across diverse domains such as medicine, finance, and science. The expansive knowledge base of LLMs has the potential to advance causal reasoning by offering interpretability, inference, and generalizability, and by uncovering novel causal structures. In this paper, we introduce a new framework, the Autonomous LLM-Augmented Causal Discovery Framework (ALCM), which combines data-driven causal discovery algorithms with LLMs to automate the generation of a more resilient, accurate, and explicable causal graph. ALCM consists of three integral components: causal structure learning, a causal wrapper, and an LLM-driven causal refiner. These components collaborate autonomously within a dynamic environment to address causal discovery questions and deliver plausible causal graphs. We evaluate the ALCM framework by implementing two demonstrations on seven well-known datasets. Experimental results demonstrate that ALCM outperforms existing LLM methods and conventional data-driven causal reasoning mechanisms. This study not only shows the effectiveness of ALCM but also points to new research directions in leveraging the causal reasoning capabilities of LLMs.
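To make the three-component design more concrete, the following is a minimal sketch of how such a pipeline could be wired together. It is not the authors' implementation: the function names (`causal_wrapper`, `llm_refiner`, `refine_graph`), the prompt wording, and the keep/reverse/drop protocol are all assumptions made for illustration. The causal structure learning stage is represented only by its output, a list of candidate edges, and the LLM is any callable that maps a prompt string to an answer string.

```python
from typing import Callable, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def causal_wrapper(edges: Iterable[Edge]) -> List[Tuple[Edge, str]]:
    """Turn each candidate edge into a natural-language question.

    The prompt format is hypothetical, chosen only for this sketch.
    """
    return [
        ((a, b), f"Does '{a}' directly cause '{b}'? Answer keep, reverse, or drop.")
        for a, b in edges
    ]


def llm_refiner(
    prompts: Iterable[Tuple[Edge, str]],
    ask_llm: Callable[[str], str],
) -> Set[Edge]:
    """Keep, reverse, or drop each edge according to the LLM's answer."""
    refined: Set[Edge] = set()
    for (a, b), prompt in prompts:
        verdict = ask_llm(prompt).strip().lower()
        if verdict == "keep":
            refined.add((a, b))
        elif verdict == "reverse":
            refined.add((b, a))
        # Any other answer is treated as "drop" in this sketch.
    return refined


def refine_graph(
    candidate_edges: Iterable[Edge],
    ask_llm: Callable[[str], str],
) -> Set[Edge]:
    """Chain the wrapper and the refiner over edges from a data-driven stage."""
    return llm_refiner(causal_wrapper(candidate_edges), ask_llm)
```

In this sketch, `candidate_edges` would come from any data-driven causal discovery algorithm, and `ask_llm` would wrap a call to whichever model is in use. Keeping the LLM behind a plain callable makes the refinement step easy to test with a stub.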
URL
https://arxiv.org/abs/2405.01744