Large Language Models for Biomedical Causal Graph Construction

Abstract

Automatic causal graph construction is of high importance in medical research. Causal graphs have many applications, such as clinical trial criteria design, where identifying confounding variables is a crucial step. The quality bar for clinical applications is high, and the lack of public corpora is a barrier to such studies. Large language models (LLMs) have demonstrated impressive capabilities in natural language processing and understanding, so applying them in clinical settings is an attractive direction, especially in applications with complex relations between entities such as diseases, symptoms, and treatments. Whereas relation extraction has already been studied using LLMs, here we present an end-to-end machine learning solution for causal relationship analysis between the aforementioned entities using electronic medical record (EMR) notes. Additionally, in comparison to other studies, we provide an extensive evaluation of the method.

URL

https://arxiv.org/abs/2301.12473

PDF

https://arxiv.org/pdf/2301.12473.pdf