Abstract
Efficient processing of long contexts has been a persistent pursuit in Natural Language Processing. With the growing number of long documents, dialogues, and other textual data, it is important to develop Long Context Language Models (LCLMs) that can process and analyze extensive inputs effectively and efficiently. In this paper, we present a comprehensive survey of recent advances in long-context modeling for large language models. Our survey is structured around three key aspects: how to obtain effective and efficient LCLMs, how to train and deploy LCLMs efficiently, and how to evaluate and analyze LCLMs comprehensively. For the first aspect, we discuss data strategies, architectural designs, and workflow approaches oriented toward long-context processing. For the second aspect, we provide a detailed examination of the infrastructure required for LCLM training and inference. For the third aspect, we present evaluation paradigms for long-context comprehension and long-form generation, as well as behavioral analysis and mechanism interpretability of LCLMs. Beyond these three key aspects, we explore the diverse application scenarios in which existing LCLMs have been deployed and outline promising directions for future development. This survey provides an up-to-date review of the literature on long-context LLMs, which we hope will serve as a valuable resource for both researchers and engineers. An associated GitHub repository collecting the latest papers and repos is available at: LCLM-Horizon.
Abstract (translated)
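In natural language processing, efficiently handling long texts has long been a goal pursued by researchers. As long documents, dialogues, and other kinds of textual data continue to emerge in large quantities, it has become especially important to develop Long Context Language Models (LCLMs) that can process and analyze large-scale inputs effectively and efficiently. This paper provides a comprehensive survey of recent progress in the LCLM field. Our survey is organized around three key aspects: how to obtain effective long-context language models, how to train and deploy these models efficiently, and how to comprehensively evaluate and analyze their performance. For the first aspect, we discuss data strategies, architectural designs, and workflow approaches for processing long texts. The second aspect examines in detail the infrastructure required for LCLM training and inference. The third aspect presents methods for evaluating long-context comprehension and long-form generation, and introduces research on behavioral analysis and mechanism interpretability. Beyond these, the survey also explores in depth the application scenarios in which existing LCLMs have been deployed and outlines promising future directions. This survey offers an up-to-date review of the literature on long-context language models and aims to serve as a valuable reference for researchers and engineers. We maintain an associated GitHub repository that collects the latest papers and related code repositories: LCLM-Horizon.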
URL
https://arxiv.org/abs/2503.17407