Abstract
We present UncertaintyRAG, a novel approach for long-context Retrieval-Augmented Generation (RAG) that uses Signal-to-Noise Ratio (SNR)-based span uncertainty to estimate similarity between text chunks. This span uncertainty improves model calibration, increasing robustness and mitigating semantic inconsistencies introduced by random chunking. Leveraging this insight, we propose an efficient unsupervised learning technique to train the retrieval model, alongside an effective data sampling and scaling strategy. UncertaintyRAG outperforms baselines by 2.03% on LLaMA-2-7B, achieving state-of-the-art results under distribution shift while using only 4% of the training data required by other advanced open-source retrieval models. Our method demonstrates strong calibration through span uncertainty, leading to improved generalization and robustness in long-context RAG tasks. Additionally, UncertaintyRAG provides a lightweight retrieval model that can be integrated into any large language model with varying context window lengths, without the need for fine-tuning, showcasing the flexibility of our approach.
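The abstract does not give the exact formulation, but one plausible reading of "SNR-based span uncertainty" can be sketched as follows: score each chunk by the signal-to-noise ratio of the token log-probabilities a language model assigns to it, then treat chunks with similar scores as similar. Every name and formula below is an illustrative assumption, not the paper's actual definition.

```python
import math

def span_snr(logprobs):
    """SNR of a span's token log-probabilities: |mean| / sample std.

    Hypothetical formula; the paper's exact definition may differ.
    """
    n = len(logprobs)
    mean = sum(logprobs) / n
    if n < 2:
        return float("inf")
    var = sum((x - mean) ** 2 for x in logprobs) / (n - 1)
    std = math.sqrt(var)
    return abs(mean) / std if std > 0 else float("inf")

def uncertainty_similarity(logprobs_a, logprobs_b):
    """Illustrative proxy: chunks with close span SNRs score near 1.0."""
    gap = abs(span_snr(logprobs_a) - span_snr(logprobs_b))
    return 1.0 / (1.0 + gap)
```

In a real pipeline the log-probabilities would come from the LLM scoring each chunk; here they would simply be lists of floats, and the similarity signal could serve as an unsupervised training target for the retriever.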
URL
https://arxiv.org/abs/2410.02719