Abstract
Attack knowledge graph construction seeks to convert textual cyber threat intelligence (CTI) reports into structured representations that portray the evolutionary traces of cyber attacks. Although previous research has proposed various methods for constructing attack knowledge graphs, these methods generally generalize poorly across diverse knowledge types and require expertise in model design and tuning. To address these limitations, we turn to Large Language Models (LLMs), which have achieved enormous success in a broad range of tasks thanks to their exceptional capabilities in both language understanding and zero-shot task fulfillment. We propose AttacKG+, a fully automatic LLM-based framework for constructing attack knowledge graphs. Our framework consists of four consecutive modules: rewriter, parser, identifier, and summarizer, each of which is implemented with LLM instruction prompting and in-context learning. Furthermore, we extend the existing attack knowledge schema into a more comprehensive version. We represent a cyber attack as a temporally unfolding event, each temporal step of which encapsulates three layers of representation: a behavior graph, MITRE TTP labels, and a state summary. Extensive evaluation demonstrates that: 1) our formulation satisfies the information needs of threat event analysis, 2) our construction framework faithfully and accurately extracts the information defined by AttacKG+, and 3) our attack graph directly benefits downstream security practices such as attack reconstruction. All the code and datasets will be released upon acceptance.
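The three-layer, step-by-step representation described above can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions: the abstract does not give a concrete schema, so the class names (`TemporalStep`, `AttackEvent`), the field names, the triple format for the behavior graph, and the sample values are all hypothetical and are not taken from the AttacKG+ code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumption: a behavior graph is stored as (subject, relation, object) triples.
Triple = Tuple[str, str, str]


@dataclass
class TemporalStep:
    """One temporal step of an attack, holding the three layers named in the abstract."""
    behavior_graph: List[Triple] = field(default_factory=list)  # layer 1
    ttp_labels: List[str] = field(default_factory=list)         # layer 2: MITRE TTP labels
    state_summary: str = ""                                     # layer 3


@dataclass
class AttackEvent:
    """A cyber attack modeled as an ordered sequence of temporal steps."""
    steps: List[TemporalStep] = field(default_factory=list)

    def ttp_sequence(self) -> List[str]:
        """Return TTP labels in temporal order, keeping only first occurrences."""
        seen, ordered = set(), []
        for step in self.steps:
            for label in step.ttp_labels:
                if label not in seen:
                    seen.add(label)
                    ordered.append(label)
        return ordered


# Made-up sample data for illustration only.
event = AttackEvent(steps=[
    TemporalStep(
        behavior_graph=[("attacker", "sends", "phishing email")],
        ttp_labels=["T1566"],
        state_summary="Victim has received a malicious attachment.",
    ),
    TemporalStep(
        behavior_graph=[("macro", "launches", "powershell.exe")],
        ttp_labels=["T1059"],
        state_summary="An attacker-controlled script is running on the host.",
    ),
])
```

Calling `event.ttp_sequence()` walks the steps in order, which is one plausible way a downstream task such as attack reconstruction might read the technique chain from such a structure.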
URL
https://arxiv.org/abs/2405.04753