Going Proactive and Explanatory Against Malware Concept Drift

2024-05-07 07:55:45
Yiling He, Junchi Lei, Zhan Qin, Kui Ren

Abstract

Deep learning-based malware classifiers face significant challenges due to concept drift. The rapid evolution of malware, especially the emergence of new families, can depress classification accuracy to near-random levels. Previous research has primarily focused on detecting drift samples, relying on expert-led analysis and labeling for model retraining. However, these methods often lack a comprehensive understanding of malware concepts and provide limited guidance for effective drift adaptation, leading to unstable detection performance and high human labeling costs. To address these limitations, we introduce DREAM, a novel system designed to surpass the capabilities of existing drift detectors and to establish an explanatory drift adaptation process. DREAM enhances drift detection through model sensitivity and data autonomy. The detector, trained with a semi-supervised approach, proactively captures malware behavior concepts through classifier feedback. During testing, it utilizes samples generated by the detector itself, eliminating reliance on extensive training data. For drift adaptation, DREAM expands the role of human intervention, allowing experts to revise both malware labels and the concept explanations embedded within the detector's latent space. To ensure a comprehensive response to concept drift, it facilitates a coordinated update process for both the classifier and the detector. Our evaluation shows that DREAM improves drift detection accuracy and reduces expert analysis effort during adaptation across different malware datasets and classifiers.

URL

https://arxiv.org/abs/2405.04095

PDF

https://arxiv.org/pdf/2405.04095.pdf

