Abstract
In the rapidly advancing domain of natural language processing (NLP), large language models (LLMs) have emerged as powerful tools for interpreting human commands and generating text across various tasks. Nonetheless, the resilience of LLMs in handling text containing inherent errors, stemming from human interactions and collaborative systems, has not been thoroughly explored. Our study investigates the resilience of LLMs against five common types of disruption: 1) ASR (Automatic Speech Recognition) errors, 2) OCR (Optical Character Recognition) errors, 3) grammatical mistakes, 4) typographical errors, and 5) distractive content. We aim to investigate how these models react by deliberately embedding these errors into instructions. Our findings reveal that while some LLMs show a degree of resistance to certain types of noise, their overall performance suffers significantly. This emphasizes the importance of further investigation into enhancing model resilience. In response to the observed decline in performance, our study also evaluates a "re-pass" strategy, designed to purify the instructions of noise before the LLMs process them. Our analysis indicates that correcting noisy instructions, particularly for open-source LLMs, presents significant challenges.
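To make the setup concrete, here is a minimal sketch of the two ideas the abstract describes: injecting noise into an instruction and a "re-pass" cleanup step. This is an illustration under stated assumptions, not the paper's implementation; the typo injector covers only one of the five noise types, and the function names (`inject_typos`, `repass_prompt`) plus the `call_llm` stand-in are hypothetical.

```python
import random
import string

def inject_typos(instruction: str, rate: float = 0.1, seed: int = 0) -> str:
    """Randomly perturb characters to simulate typographical errors.

    Illustrative stand-in for one of the five disruption types; the
    paper's actual ASR/OCR/grammar/typo/distraction pipelines are not
    reproduced here.
    """
    rng = random.Random(seed)
    chars = list(instruction)
    for i, c in enumerate(chars):
        if c.isalpha() and rng.random() < rate:
            op = rng.choice(["swap", "drop", "insert"])
            if op == "swap" and i + 1 < len(chars):
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
            elif op == "drop":
                chars[i] = ""  # delete the character
            else:
                chars[i] = c + rng.choice(string.ascii_lowercase)  # insert a stray letter
    return "".join(chars)

def repass_prompt(noisy_instruction: str) -> str:
    """Build a correction prompt for the 're-pass' step: ask a model to
    restore a clean instruction before the task model executes it."""
    return (
        "The following instruction may contain ASR/OCR errors, typos, or "
        "grammatical mistakes. Rewrite it as a clean, faithful instruction "
        "without answering it:\n\n" + noisy_instruction
    )

if __name__ == "__main__":
    clean = "Summarize the main findings of the report in three bullet points."
    noisy = inject_typos(clean, rate=0.15)
    print("noisy :", noisy)
    print("prompt:", repass_prompt(noisy))
    # In a full pipeline, the denoised output of call_llm(repass_prompt(noisy))
    # (call_llm is a hypothetical model interface) would be passed to the
    # downstream task model in place of the noisy instruction.
```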
URL
https://arxiv.org/abs/2404.09754