Paper Reading AI Learner

Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding

2024-05-04 20:38:41
Zheng Zhao, Emilio Monti, Jens Lehmann, Haytham Assem

Abstract

Large language models (LLMs) tend to inadequately integrate input context during text generation, relying excessively on encoded prior knowledge in model parameters, potentially resulting in generated text with factual inconsistencies or contextually unfaithful content. LLMs utilize two primary knowledge sources: 1) prior (parametric) knowledge from pretraining, and 2) contextual (non-parametric) knowledge from input prompts. The study addresses the open question of how LLMs effectively balance these knowledge sources during the generation process, specifically in the context of open-domain question answering. To address this issue, we introduce a novel approach integrating contrastive decoding with adversarial irrelevant passages as negative samples to enhance robust context grounding during generation. Notably, our method operates at inference time without requiring further training. We conduct comprehensive experiments to demonstrate its applicability and effectiveness, providing empirical evidence showcasing its superiority over existing methodologies. Our code is publicly available at: this https URL.
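The core idea described in the abstract — contrasting the model's next-token distribution given the relevant context against its distribution given an adversarial irrelevant passage — can be illustrated with a minimal sketch. This is a generic contrastive-decoding scheme, not the authors' exact formulation; the function names, the toy logits, and the `alpha` weighting are all illustrative assumptions.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of raw scores.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def contrastive_decode(logits_ctx, logits_neg, alpha=1.0):
    """Pick the token whose probability the relevant context raises most
    relative to an adversarial irrelevant passage (hypothetical sketch).

    logits_ctx: next-token logits conditioned on the relevant context.
    logits_neg: next-token logits conditioned on an irrelevant passage.
    alpha: strength of the contrastive penalty (illustrative parameter).
    """
    lp_ctx = log_softmax(logits_ctx)
    lp_neg = log_softmax(logits_neg)
    scores = [c - alpha * n for c, n in zip(lp_ctx, lp_neg)]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy 4-token vocabulary. Token 0 is favored by parametric prior knowledge
# (high under both conditions); token 1 is favored specifically by the
# relevant context. Plain greedy decoding on the context logits picks the
# prior-driven token 0, while the contrastive score picks token 1.
ctx = [2.0, 1.0, 0.5, 0.0]
neg = [2.0, -1.0, 0.5, 0.0]
greedy = max(range(len(ctx)), key=ctx.__getitem__)
contrastive = contrastive_decode(ctx, neg)
```

In this toy example the greedy choice is the prior-driven token, while the contrastive score rewards the token whose probability the context specifically raised — the intuition behind using irrelevant passages as negative samples for context grounding.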


URL

https://arxiv.org/abs/2405.02750

PDF

https://arxiv.org/pdf/2405.02750.pdf

