Abstract
Hallucination remains one of the most critical challenges to the institutional adoption of Large Language Models (LLMs). In this context, an overwhelming number of studies have focused on the post-generation phase: refining outputs via feedback, analyzing output logit values, or deriving clues from the outputs' artifacts. We propose HalluciBot, a model that predicts the probability of hallucination $\textbf{before generation}$ for any query posed to an LLM. In essence, HalluciBot invokes no generation during inference. To derive empirical evidence for HalluciBot, we employ a Multi-Agent Monte Carlo Simulation with a Query Perturbator that crafts $n$ variations per query at train time. The construction of our Query Perturbator is motivated by a new definition of hallucination that we introduce: $\textit{truthful hallucination}$. Our training methodology generated 2,219,022 estimates for a training corpus of 369,837 queries, spanning 13 diverse datasets and 3 question-answering scenarios. HalluciBot predicts both binary and multi-class probabilities of hallucination, providing a means to judge a query's quality with regard to its propensity to hallucinate. HalluciBot therefore paves the way to revise or cancel a query before generation, avoiding the ensuing computational waste. Moreover, it provides a lucid means of measuring user accountability for hallucinatory queries.
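The Monte Carlo labeling stage described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the names `perturb`, `answer`, and the toy stand-ins are assumptions introduced purely to show the shape of the procedure (perturb each query $n$ ways, sample an answer per variation, and record the fraction of incorrect answers as that query's empirical hallucination rate).

```python
# Hypothetical sketch of the Monte Carlo estimation stage: perturb a query
# n ways, query an agent once per variation, and take the fraction of
# answers disagreeing with the gold label as the hallucination rate.
# All names here are illustrative stand-ins, not the paper's components.
from typing import Callable, List


def hallucination_rate(
    query: str,
    gold: str,
    perturb: Callable[[str, int], List[str]],
    answer: Callable[[str], str],
    n: int = 5,
) -> float:
    """Fraction of answers (original query + n perturbations) that miss gold."""
    variants = [query] + perturb(query, n)
    wrong = sum(
        1 for q in variants
        if answer(q).strip().lower() != gold.strip().lower()
    )
    return wrong / len(variants)


# Deterministic toy stand-ins for demonstration only.
def toy_perturb(q: str, n: int) -> List[str]:
    return [f"{q} (rephrased {i})" for i in range(n)]


def toy_answer(q: str) -> str:
    # Errs on variants whose trailing digit is 1 or 3, answers correctly otherwise.
    return "Lyon" if q.rstrip(")").endswith(("1", "3")) else "Paris"
```

At scale, these per-query rates become the binary or multi-class training labels for the classifier, which at inference time scores a query without invoking any generation.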
URL
https://arxiv.org/abs/2404.12535