Abstract
Large language models (LLMs) are highly capable across many tasks, but they can sometimes generate unreliable or inaccurate outputs. To tackle this issue, this paper studies the problem of uncertainty estimation and calibration for LLMs. We begin by formulating the uncertainty estimation problem for LLMs and then propose a supervised approach that leverages labeled datasets to estimate the uncertainty in LLMs' responses. Based on this formulation, we illustrate how uncertainty estimation for LLMs differs from that for standard ML models and explain why the hidden activations of LLMs contain uncertainty information. Our approach demonstrates the benefits of utilizing hidden activations for enhanced uncertainty estimation across various tasks and shows robust transferability in out-of-distribution settings. Moreover, we distinguish the uncertainty estimation task from the uncertainty calibration task and show that a better uncertainty estimation model leads to better calibration performance. In practice, our method is easy to implement and adapts to different levels of model transparency, including black-box, grey-box, and white-box settings, each achieving strong performance given the accessible parts of the LLM's internal mechanisms.
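To make the core idea concrete, below is a minimal sketch (not the paper's exact method) of supervised uncertainty estimation from hidden activations in the white-box setting. The model name ("gpt2"), the choice of a middle layer with mean pooling, the tiny toy dataset, and the logistic-regression probe are all illustrative assumptions; the paper's actual estimator, features, and datasets may differ.

```python
# Sketch: train a supervised probe on an LLM's hidden activations to
# predict whether a response is correct; the predicted probability of
# correctness serves as a confidence score (1 - p as uncertainty).
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

# Stand-in white-box LLM (assumption: any model exposing hidden states works).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

def hidden_feature(prompt_and_response: str) -> np.ndarray:
    """Mean-pooled activation of a middle layer for one (prompt, response) pair."""
    inputs = tokenizer(prompt_and_response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(**inputs)
    layer = out.hidden_states[len(out.hidden_states) // 2]  # (1, seq_len, dim)
    return layer.mean(dim=1).squeeze(0).numpy()

# Labeled data: each LLM response is marked correct (1) or incorrect (0),
# e.g., by comparing against ground-truth answers in a QA dataset.
texts = [
    "Q: What is the capital of France? A: Paris",
    "Q: What is the capital of France? A: Lyon",
]
labels = [1, 0]

X = np.stack([hidden_feature(t) for t in texts])
probe = LogisticRegression(max_iter=1000).fit(X, labels)

# Probability of correctness for each response; 1 - p is the
# estimated uncertainty of that response.
p_correct = probe.predict_proba(X)[:, 1]
print(p_correct)
```

In the grey-box setting the same probe could instead be fit on token-level output probabilities, and in the black-box setting on features derived from sampled responses alone; the underlying supervised recipe stays the same.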
URL
https://arxiv.org/abs/2404.15993