Paper Reading AI Learner

Improved AutoEncoder with LSTM module and KL divergence

2024-04-30 04:11:21
Wei Huang, Bingyang Zhang, Kaituo Zhang, Hua Gao, Rongchun Wan

Abstract

The task of anomaly detection is to separate anomalous data from normal data in a dataset. Models such as the deep convolutional autoencoder (CAE) network and the deep support vector data description (SVDD) model have been widely employed and have demonstrated significant success in detecting anomalies. However, the CAE network's tendency to over-reconstruct anomalous data can easily lead to a high false negative rate in anomaly detection. On the other hand, the deep SVDD model suffers from feature collapse, which reduces its detection accuracy for anomalies. To address these problems, we propose the Improved AutoEncoder with LSTM module and Kullback-Leibler divergence (IAE-LSTM-KL) model in this paper. An LSTM network is added after the encoder to memorize feature representations of normal data. Meanwhile, feature collapse is mitigated by penalizing, via KL divergence, the features fed into the SVDD module. The efficacy of the IAE-LSTM-KL model is validated through experiments on both synthetic and real-world datasets. Experimental results show that the IAE-LSTM-KL model yields higher detection accuracy for anomalies. In addition, the IAE-LSTM-KL model demonstrates enhanced robustness to contaminated outliers in the dataset.
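The abstract describes a composite objective: autoencoder reconstruction, a one-class SVDD distance on the encoded features, and a KL-divergence penalty on those features to prevent collapse. The sketch below is a hedged NumPy illustration of how such a weighted sum might be computed; the function name, the weights `lam_svdd` and `lam_kl`, and the Gaussian-vs-standard-normal form of the KL term are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    # KL( N(mu, sigma^2) || N(0, I) ), summed over feature dims,
    # averaged over the batch; a common closed form for Gaussian features.
    return float(np.mean(0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1)))

def iae_lstm_kl_loss(x, x_recon, z, mu, log_var, center, lam_svdd=1.0, lam_kl=0.1):
    # x, x_recon : (batch, dim) inputs and decoder reconstructions
    # z          : (batch, feat) features entering the SVDD module
    # mu, log_var: (batch, feat) Gaussian parameters of the features
    # center     : (feat,) SVDD hypersphere center
    recon = float(np.mean((x - x_recon) ** 2))                 # CAE reconstruction term
    svdd = float(np.mean(np.sum((z - center) ** 2, axis=1)))   # one-class SVDD distance
    kl = kl_to_standard_normal(mu, log_var)                    # anti-collapse penalty
    return recon + lam_svdd * svdd + lam_kl * kl
```

With features exactly at the center, perfect reconstruction, and standard-normal parameters, every term vanishes; anomalies would score high on the SVDD distance term.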


URL

https://arxiv.org/abs/2404.19247

PDF

https://arxiv.org/pdf/2404.19247.pdf

