Abstract
Due to the unsupervised nature of anomaly detection, the key to fueling deep models is finding supervisory signals. Unlike current reconstruction-guided generative models and transformation-based contrastive models, we devise novel data-driven supervision for tabular data by introducing a characteristic -- scale -- as data labels. Given representations of varied sub-vectors of data instances, we define scale as the relationship between the dimensionality of an original sub-vector and that of its representation. Scales serve as labels attached to transformed representations, thus offering ample labeled data for neural network training. This paper further proposes a scale learning-based anomaly detection method. Supervised by the learning objective of scale distribution alignment, our approach learns the ranking of representations converted from varied subspaces of each data instance. Through this proxy task, our approach models inherent regularities and patterns within the data, which well describes data "normality". Anomaly scores of test instances are obtained by measuring how well they fit these learned patterns. Extensive experiments show that our approach leads to significant improvement over state-of-the-art generative/contrastive anomaly detection methods.
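The scale-labeling construction described above can be illustrated with a minimal toy sketch (not the authors' implementation): sub-vectors of varying dimensionality are sampled from each tabular instance, mapped to a common-size representation (here a fixed random projection stands in for the learned transformation), and each representation is labeled with its scale, taken as the ratio of sub-vector dimensionality to representation dimensionality. The function name and the ratio-based label are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_scale_labeled_data(X, subvector_dims, rep_dim=8):
    """Build (representation, scale-label) pairs from tabular data X.

    For each dimensionality in `subvector_dims`, sample a random sub-vector
    of every instance, project it to a `rep_dim`-sized representation, and
    attach its scale (sub-vector dim / representation dim) as the label.
    """
    reps, scales = [], []
    for dim in subvector_dims:
        idx = rng.choice(X.shape[1], size=dim, replace=False)  # random subspace
        W = rng.standard_normal((dim, rep_dim)) / np.sqrt(dim)  # toy projection
        reps.append(X[:, idx] @ W)
        scales.append(np.full(X.shape[0], dim / rep_dim))
    return np.concatenate(reps), np.concatenate(scales)

X = rng.standard_normal((100, 20))  # toy tabular dataset: 100 instances, 20 features
reps, scales = make_scale_labeled_data(X, subvector_dims=[4, 8, 16])
print(reps.shape, scales.shape)  # 300 labeled representations from 100 instances
```

A network trained to predict (or align the distribution of) these scale labels would then serve as the proxy task; instances whose representations violate the learned scale ranking receive high anomaly scores.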
URL
https://arxiv.org/abs/2305.16114