Paper Reading AI Learner

Streaming Anomaly Detection

2023-01-30 18:59:51
Siddharth Bhatia

Abstract

Anomaly detection is critical for finding suspicious behavior in innumerable systems. We need to detect anomalies in real-time, i.e. determine if an incoming entity is anomalous or not, as soon as we receive it, to minimize the effects of malicious activities and start recovery as soon as possible. Therefore, online algorithms that can detect anomalies in a streaming manner are essential. We first propose MIDAS which uses a count-min sketch to detect anomalous edges in dynamic graphs in an online manner, using constant time and memory. We then propose two variants, MIDAS-R which incorporates temporal and spatial relations, and MIDAS-F which aims to filter away anomalous edges to prevent them from negatively affecting the internal data structures. We then extend the count-min sketch to a Higher-Order sketch to capture complex relations in graph data, and to reduce detecting suspicious dense subgraph problem to finding a dense submatrix in constant time. Using this sketch, we propose four streaming methods to detect edge and subgraph anomalies. Next, we broaden the graph setting to multi-aspect data. We propose MStream which detects explainable anomalies in multi-aspect data streams. We further propose MStream-PCA, MStream-IB, and MStream-AE to incorporate correlation between features. Finally, we consider multi-dimensional data streams with concept drift and propose MemStream. MemStream leverages the power of a denoising autoencoder to learn representations and a memory module to learn the dynamically changing trend in data without the need for labels. We prove a theoretical bound on the size of memory for effective drift handling. In addition, we allow quick retraining when the arriving stream becomes sufficiently different from the training data. Furthermore, MemStream makes use of two architecture design choices to be robust to memory poisoning.

Abstract (translated)

异常检测对于发现数量庞大的系统中的可疑行为至关重要。我们需要在实时的情况下检测异常,即一旦收到一个输入实体,就确定它是否存在异常,以最大限度地减少恶意活动的影响,并尽快开始恢复。因此,必须存在能够以流方式检测异常的在线算法。我们首先提出了MIDAS,该算法使用计数最小 sketch 在动态图中检测异常边,同时使用恒定时间和内存。我们提出了两个变体:MIDAS-R 包括时间空间关系,MIDAS-F 旨在过滤掉异常边,防止它们对内部数据结构产生负面影响。我们将计数最小 sketch 扩展为高阶 sketch,以捕捉图数据中的复杂关系,并降低检测可疑密集子图的问题,将其降低到恒定时间中找到密集子矩阵。使用这个 sketch 我们可以提出四个流方法来检测边和子异常。接下来,我们将 graph 设置扩展到多个维度。我们提出了 MStream,该算法在多个维度数据流中检测可解释的异常。我们进一步提出了 MStream-PCA、MStream-IB 和 MStream-AE,以整合特征之间的相关性。最后,我们考虑具有概念漂移的数据流,并提出了 MemStream。 MemStream 利用去噪自编码器的功率来学习表示和内存模块来学习数据的动态变化趋势,而无需标签。我们证明了内存处理的有效边界大小。此外,当输入流与训练数据足够不同时,我们允许快速重新训练。此外,MemStream 使用了两个架构设计选择,以鲁棒地应对内存中毒。

URL

https://arxiv.org/abs/2301.13199

PDF

https://arxiv.org/pdf/2301.13199.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot