Abstract
Out-of-distribution (OOD) detection is critical for the safe deployment of machine learning systems. Existing post-hoc detectors typically rely on model confidence scores or likelihood estimates in feature space, often under restrictive distributional assumptions. In this work, we introduce a third paradigm and formulate OOD detection from a diversity perspective. We propose the Vendi Novelty Score (VNS), an OOD detector based on the Vendi Scores (VS), a family of similarity-based diversity metrics. VNS quantifies how much a test sample increases the VS of the in-distribution feature set, providing a principled notion of novelty that does not require density modeling. VNS is linear-time, non-parametric, and naturally combines class-conditional (local) and dataset-level (global) novelty signals. Across multiple image classification benchmarks and network architectures, VNS achieves state-of-the-art OOD detection performance. Remarkably, VNS retains this performance when computed using only 1% of the training data, enabling deployment in memory- or access-constrained settings.
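The core idea, scoring a test sample by how much it raises the Vendi Score of the in-distribution features, can be illustrated with a naive sketch. Everything here is an assumption made for illustration and is not taken from the paper: the cosine kernel, the use of the order-2 member of the Vendi Score family (which has the closed form n^2 / sum_ij K_ij^2 and so needs no eigendecomposition), and the hypothetical function names `vendi_score_q2` and `naive_novelty`. The sketch recomputes the full kernel on every call, so it is quadratic in the number of reference samples; it does not reproduce the linear-time computation or the local/global combination described in the abstract.

```python
import math


def normalize(v):
    """Scale a feature vector to unit L2 norm (assumes v is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def vendi_score_q2(features):
    """Order-2 Vendi Score under a cosine kernel (illustrative choice).

    With K the n x n cosine-similarity matrix, the eigenvalues of K / n
    sum to 1, and the order-2 score is 1 / sum(lambda_i ** 2), which
    equals n ** 2 / sum_ij K_ij ** 2.
    """
    unit = [normalize(v) for v in features]
    n = len(unit)
    frob_sq = 0.0
    for a in unit:
        for b in unit:
            k = sum(x * y for x, y in zip(a, b))
            frob_sq += k * k
    return n * n / frob_sq


def naive_novelty(reference_features, test_feature):
    """Increase in the score when test_feature joins the reference set.

    Hypothetical stand-in for a novelty score; larger means more novel.
    """
    before = vendi_score_q2(reference_features)
    after = vendi_score_q2(reference_features + [test_feature])
    return after - before


# Toy 2-D "features": the reference set points roughly along the x-axis.
reference = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]]
print(naive_novelty(reference, [1.0, 0.05]))  # sample similar to the reference set
print(naive_novelty(reference, [0.0, 1.0]))   # sample in a direction the set lacks
```

In this toy setting, a sample pointing in a direction the reference set does not cover raises the score more than a near-duplicate does, which is the intuition behind treating the increase as a novelty signal.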
URL
https://arxiv.org/abs/2602.10062