Abstract
Large language models (LLMs) show early signs of artificial general intelligence but struggle with hallucinations. One promising solution to mitigate these hallucinations is to store external knowledge as embeddings, aiding LLMs in retrieval-augmented generation (RAG). However, such a solution risks compromising privacy: recent studies have shown experimentally that the original text can be partially reconstructed from text embeddings by pre-trained language models, and the superior capability of LLMs over those models may exacerbate this risk. To this end, we investigate how effectively original knowledge can be reconstructed, and entity attributes predicted, from these embeddings when LLMs are employed. Empirical findings indicate that LLMs significantly improve accuracy on both evaluated tasks over pre-trained models, regardless of whether the texts are in-distribution or out-of-distribution. This underscores a heightened potential for LLMs to jeopardize user privacy and highlights a negative consequence of their widespread use. We further discuss preliminary strategies to mitigate this risk.
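To make the threat model concrete, below is a minimal sketch of one common embedding-inversion setup, not necessarily the method evaluated in this paper: a "victim" encoder produces sentence embeddings (as stored in a RAG vector database), and an attacker projects each stolen embedding into a generative model's input space as a soft prompt, then decodes candidate text. The model names and the linear projector are illustrative assumptions; in a real attack the projector would be trained on (embedding, text) pairs rather than randomly initialized.

```python
# Hedged sketch of an embedding-inversion attack. All model choices and the
# projector architecture are assumptions for illustration only.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoModelForCausalLM, AutoTokenizer

# Victim encoder (produces the embeddings the attacker observes).
enc_tok = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
encoder = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

# Attacker's decoder LM (GPT-2 stands in for a stronger LLM here).
dec_tok = AutoTokenizer.from_pretrained("gpt2")
decoder = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical projector: maps one 384-d sentence embedding to k soft-prompt
# vectors in the decoder's 768-d input embedding space. In practice this
# module would be trained on (embedding, text) pairs; here it is untrained,
# so the decoded output will be meaningless until training is performed.
k = 8
projector = nn.Linear(384, k * decoder.config.n_embd)

@torch.no_grad()
def embed(text: str) -> torch.Tensor:
    """Mean-pooled sentence embedding from the victim encoder."""
    batch = enc_tok(text, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state          # (1, seq_len, 384)
    mask = batch["attention_mask"].unsqueeze(-1)         # (1, seq_len, 1)
    return (hidden * mask).sum(1) / mask.sum(1)          # (1, 384)

@torch.no_grad()
def invert(embedding: torch.Tensor, max_new_tokens: int = 32) -> str:
    """Decode candidate text conditioned on the observed embedding."""
    soft_prompt = projector(embedding).view(1, k, -1)    # (1, k, 768)
    out = decoder.generate(
        inputs_embeds=soft_prompt,
        max_new_tokens=max_new_tokens,
        do_sample=False,
        pad_token_id=dec_tok.eos_token_id,
    )
    return dec_tok.decode(out[0], skip_special_tokens=True)

# A private sentence the attacker never sees directly, only its embedding.
print(invert(embed("Alice was treated for diabetes in 2021.")))
```

The paper's second task, predicting entity attributes, needs no generation step at all: under the same assumptions, an attacker can simply train a lightweight classifier on the observed embeddings, which is one reason embeddings alone do not anonymize the underlying text.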
URL
https://arxiv.org/abs/2404.16587