Abstract
Anomaly detection aims to separate anomalous data from normal data in a dataset. Models such as the deep convolutional autoencoder (CAE) network and the deep support vector data description (SVDD) model are widely used and have shown significant success in detecting anomalies. However, a CAE network can reconstruct anomalous data too well, which easily leads to a high false negative rate. The deep SVDD model, on the other hand, suffers from feature collapse, which reduces its detection accuracy. To address these problems, we propose the Improved AutoEncoder with LSTM module and Kullback-Leibler divergence (IAE-LSTM-KL) model. An LSTM network is added after the encoder to memorize feature representations of normal data. Meanwhile, feature collapse is mitigated by penalizing the features fed to the SVDD module via the KL divergence. The efficacy of the IAE-LSTM-KL model is validated through experiments on both synthetic and real-world datasets. Experimental results show that the IAE-LSTM-KL model yields higher detection accuracy for anomalies. The model is also found to be more robust to outliers that contaminate the dataset.
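As a rough illustration of why a KL penalty can counteract feature collapse, the sketch below combines a one-class SVDD-style distance loss with a KL term. The abstract does not give the exact loss, so everything here is an assumption: the features are modeled as a diagonal Gaussian whose per-dimension mean and variance are estimated over the batch, the reference distribution is a standard normal, and the names `svdd_loss`, `kl_to_standard_normal`, `combined_loss`, and `kl_weight` are hypothetical.

```python
import math

def svdd_loss(features, center):
    """Mean squared distance of feature vectors to the hypersphere center."""
    total = 0.0
    for vec in features:
        total += sum((f - c) ** 2 for f, c in zip(vec, center))
    return total / len(features)

def kl_to_standard_normal(features, eps=1e-8):
    """KL( N(mu, diag(var)) || N(0, I) ), with mu and var estimated
    per dimension over the batch (an assumed form, not the paper's)."""
    n = len(features)
    dims = len(features[0])
    kl = 0.0
    for d in range(dims):
        column = [vec[d] for vec in features]
        mu = sum(column) / n
        var = sum((v - mu) ** 2 for v in column) / n + eps  # eps avoids log(0)
        kl += 0.5 * (var + mu ** 2 - 1.0 - math.log(var))
    return kl

def combined_loss(features, center, kl_weight=0.1):
    """SVDD distance term plus a weighted KL penalty (kl_weight is hypothetical)."""
    return svdd_loss(features, center) + kl_weight * kl_to_standard_normal(features)

center = [0.0, 0.0]
collapsed = [[0.0, 0.0]] * 4                                  # every input maps to the center
spread = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]]  # unit variance per dimension

collapsed_total = combined_loss(collapsed, center)
spread_kl = kl_to_standard_normal(spread)
```

In this toy setting the collapsed batch has zero SVDD loss, which is exactly the degenerate solution, but its near-zero variance makes the KL term large. The spread batch pays a nonzero distance cost while its KL term is close to zero, so the combined objective no longer rewards mapping every input to the center.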
URL
https://arxiv.org/abs/2404.19247