Abstract
The Model Context Protocol (MCP) has emerged as the de facto standard for connecting Large Language Models (LLMs) to external data and tools, effectively functioning as the "USB-C for Agentic AI." While this decoupling of context and execution solves critical interoperability challenges, it introduces a profound new threat landscape where the boundary between epistemic errors (hallucinations) and security breaches (unauthorized actions) dissolves. This Systematization of Knowledge (SoK) aims to provide a comprehensive taxonomy of risks in the MCP ecosystem, distinguishing between adversarial security threats (e.g., indirect prompt injection, tool poisoning) and epistemic safety hazards (e.g., alignment failures in distributed tool delegation). We analyze the structural vulnerabilities of MCP primitives, specifically Resources, Prompts, and Tools, and demonstrate how "context" can be weaponized to trigger unauthorized operations in multi-agent environments. Furthermore, we survey state-of-the-art defenses, ranging from cryptographic provenance (ETDI) to runtime intent verification, and conclude with a roadmap for securing the transition from conversational chatbots to autonomous agentic operating systems.
Abstract (translated)
模型上下文协议(MCP)已成为将大型语言模型(LLMs)与外部数据和工具连接起来的事实上的标准,有效充当了“代理人工智能的USB-C”。尽管这种解耦背景信息和执行的方式解决了关键的互操作性挑战,但它引入了一个全新的威胁环境,在这个环境中,知识性的错误(幻觉)和安全漏洞(未经授权的操作)之间的界限变得模糊。本文系统地提供了MCP生态系统中风险的全面分类,区分了对抗性安全威胁(例如,间接提示注入、工具中毒)与认知安全危险(例如,在分布式工具委派中的对齐失败)。我们分析了MCP基本元素——资源、提示和工具——的结构性漏洞,并展示了“上下文”如何在多代理环境中被武器化以触发未经授权的操作。此外,我们还调研了最先进的防御措施,从加密原产地验证(ETDI)到运行时意图验证,最后提出了一个路线图,旨在确保从对话聊天机器人过渡到自主操作系统的安全过程。
URL
https://arxiv.org/abs/2512.08290