Abstract
In this paper, we present a novel approach to out-of-distribution (OOD) detection that combines deep metric learning with synthetic data generated by diffusion models. One popular approach to OOD detection is outlier exposure, where models are trained on a mixture of in-distribution (ID) samples and "seen" OOD samples. For the OOD samples, the model is trained to minimize the KL divergence between its output probabilities and the uniform distribution, while still classifying the ID data correctly. We propose a label-mixup approach that generates synthetic OOD data using Denoising Diffusion Probabilistic Models (DDPMs). Additionally, we explore recent advances in metric learning to train our models. In our experiments, metric learning-based loss functions perform better than the softmax loss. Furthermore, the baseline models (both softmax and metric learning) improve significantly when trained with the generated OOD data. Our approach outperforms strong baselines on conventional OOD detection metrics.
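To make the outlier exposure objective described above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the weighting factor `lam`, and the choice of KL direction (uniform relative to the model output) are all hypothetical, since the abstract does not specify them.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_norm for z in logits]


def cross_entropy(logits, label):
    """Standard classification loss for one ID sample."""
    return -log_softmax(logits)[label]


def kl_uniform_to_output(logits):
    """KL(U || p) for one OOD sample, where U is uniform over the classes.

    Assumption: the direction of the KL term is our choice for illustration.
    It is zero exactly when the model output is uniform.
    """
    k = len(logits)
    log_p = log_softmax(logits)
    return -math.log(k) - sum(log_p) / k


def outlier_exposure_loss(id_batch, ood_batch, lam=0.5):
    """Mean cross-entropy on ID samples plus lam times mean KL on OOD samples.

    id_batch: list of (logits, label) pairs; ood_batch: list of logits.
    lam is a hypothetical weighting hyperparameter.
    """
    id_term = sum(cross_entropy(z, y) for z, y in id_batch) / len(id_batch)
    ood_term = sum(kl_uniform_to_output(z) for z in ood_batch) / len(ood_batch)
    return id_term + lam * ood_term
```

In this sketch, a confident prediction on an OOD sample is penalized by the KL term, while a uniform prediction on that sample incurs no penalty. Synthetic OOD samples, such as those generated by a diffusion model, would simply be supplied as `ood_batch`.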
URL
https://arxiv.org/abs/2405.00631