
False Alarms, Real Damage: Adversarial Attacks Using LLM-based Models on Text-based Cyber Threat Intelligence Systems

2025-07-05 19:00:27
Samaneh Shafee, Alysson Bessani, Pedro M. Ferreira

Abstract

Cyber Threat Intelligence (CTI) has emerged as a vital complementary approach that operates in the early phases of the cyber threat lifecycle. CTI involves collecting, processing, and analyzing threat data to provide a more accurate and rapid understanding of cyber threats. Because of the large volume of data, automation through Machine Learning (ML) and Natural Language Processing (NLP) models is essential for effective CTI extraction. These automated systems leverage Open Source Intelligence (OSINT) from sources such as social networks, forums, and blogs to identify Indicators of Compromise (IoCs). Although prior research has focused on adversarial attacks on specific ML models, this study expands the scope by investigating vulnerabilities within the components of the entire CTI pipeline and their susceptibility to adversarial attacks. These vulnerabilities arise because the pipeline components ingest textual inputs from various open sources, which include both real and potentially fake content. We analyse three types of attacks against CTI pipelines (evasion, flooding, and poisoning) and assess their impact on the system's information selection capabilities. For fake text generation specifically, the work demonstrates how adversarial text generation techniques can create fake cybersecurity and cybersecurity-like text that misleads classifiers, degrades performance, and disrupts system functionality. The focus is primarily on the evasion attack, as it precedes and enables flooding and poisoning attacks within the CTI pipeline.
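
To make the pipeline idea in the abstract concrete, here is a minimal toy sketch in Python. It is not the authors' system. The keyword set, the threshold, the function names, and the sample posts are all hypothetical, and a simple keyword match stands in for the trained ML/NLP classifier. The IP addresses come from ranges reserved for documentation. The sketch only illustrates the general point that a pipeline ingesting open-source text will also extract indicators from fabricated text if that text passes the relevance filter.

```python
import re

# Hypothetical keyword list standing in for a trained relevance classifier.
SECURITY_TERMS = {"malware", "ransomware", "exploit", "phishing", "botnet", "c2"}

# Simple IPv4 pattern used as the only IoC type in this toy example.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def is_relevant(text, threshold=2):
    """Toy filter: a post is 'relevant' if it contains enough security terms."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(tokens & SECURITY_TERMS) >= threshold


def extract_iocs(text):
    """Return every IPv4-looking string in the text."""
    return IPV4_RE.findall(text)


def run_pipeline(posts):
    """Filter posts for relevance, then collect IoCs from the accepted ones."""
    iocs = []
    for post in posts:
        if is_relevant(post):
            iocs.extend(extract_iocs(post))
    return iocs


# Hypothetical inputs: one genuine report, one fabricated post, one unrelated post.
real_post = "New ransomware campaign uses exploit kit, C2 at 203.0.113.7"
fake_post = "Ransomware botnet exploit spreading via 198.51.100.23"
benign_post = "Our team had a great lunch today"

print(run_pipeline([real_post, fake_post, benign_post]))
```

In this sketch the fabricated post passes the filter, so its made-up address ends up in the output next to the genuine one. Repeating such posts many times would fill the output with false alarms, which loosely mirrors how the abstract describes evasion enabling flooding.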


URL

https://arxiv.org/abs/2507.06252

PDF

https://arxiv.org/pdf/2507.06252.pdf

