Tool Calling: Enhancing Medication Consultation via Retrieval-Augmented Large Language Models


Abstract

Large-scale language models (LLMs) have achieved remarkable success across various language tasks but suffer from hallucinations and temporal misalignment. To mitigate these shortcomings, retrieval-augmented generation (RAG) has been used to supply external knowledge during answer generation. However, applying such models to the medical domain is challenging because they lack domain-specific knowledge and real-world scenarios are intricate. In this study, we explore LLMs within a RAG framework for knowledge-intensive tasks in the medical field. To evaluate the capabilities of LLMs, we introduce MedicineQA, a multi-round dialogue benchmark that simulates real-world medication consultation and requires LLMs to answer with evidence retrieved from a medicine database. MedicineQA contains 300 multi-round question-answering pairs, each embedded within a detailed dialogue history, highlighting the challenge this knowledge-intensive task poses to current LLMs. We further propose a new Distill-Retrieve-Read framework to replace the previous Retrieve-then-Read approach. Specifically, the distillation and retrieval process uses a tool-calling mechanism to formulate search queries that emulate the keyword-based inquiries used by search engines. Experimental results show that our framework brings notable performance improvements and surpasses previous counterparts in evidence retrieval accuracy. This advancement sheds light on applying RAG to the medical domain.
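
As a rough illustration of the Distill-Retrieve-Read flow described in the abstract, the sketch below condenses a multi-round dialogue into a keyword query wrapped in a tool call, retrieves evidence by keyword overlap, and passes the evidence to a reader. Everything here is an assumption made for illustration: the function names, the `search_medicine` tool name, the stopword heuristic standing in for the LLM, the overlap scoring, and the two fictional database entries. None of it is taken from the paper's implementation or from MedicineQA.

```python
import re
from dataclasses import dataclass


@dataclass
class Evidence:
    name: str
    text: str


# Fictional entries for illustration only; not real medicines or MedicineQA data.
DATABASE = [
    Evidence("medicine_a", "medicine_a label: headache indication, adult dosage section"),
    Evidence("medicine_b", "medicine_b label: cough indication, adult dosage section"),
]

# Hypothetical stopword list used by the heuristic distiller below.
STOPWORDS = {"i", "have", "a", "can", "take", "what", "is", "the", "my", "for", "it"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def distill(dialogue: list[str]) -> dict:
    """Condense the dialogue history into a keyword query wrapped in a tool call.

    In the framework the abstract describes, an LLM formulates the query;
    this stopword filter is only a stand-in so the sketch runs offline.
    """
    keywords: list[str] = []
    for turn in dialogue:
        for token in tokenize(turn):
            if token not in STOPWORDS and token not in keywords:
                keywords.append(token)
    return {"tool": "search_medicine", "arguments": {"query": " ".join(keywords)}}


def retrieve(query: str, top_k: int = 1) -> list[Evidence]:
    """Rank database entries by how many query keywords they contain."""
    query_terms = set(tokenize(query))
    scored = [(len(query_terms & set(tokenize(e.text))), e) for e in DATABASE]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [e for score, e in ranked[:top_k] if score > 0]


def read(question: str, evidence: list[Evidence]) -> str:
    """Placeholder reader: a real system would prompt an LLM with the evidence."""
    if not evidence:
        return "No supporting evidence found."
    return f"Answer to '{question}' grounded in: {evidence[0].text}"


dialogue = ["I have a headache.", "Can I take medicine_a?", "What dosage is safe?"]
call = distill(dialogue)
hits = retrieve(call["arguments"]["query"], top_k=1)
print(call)
print(read(dialogue[-1], hits))
```

The point of the distillation step is that the query is built from the whole dialogue history rather than the last turn alone, so the final question "What dosage is safe?" still retrieves the entry for the medicine named two turns earlier.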

URL

https://arxiv.org/abs/2404.17897

PDF

https://arxiv.org/pdf/2404.17897.pdf
