
Supervised Anomaly Detection for Complex Industrial Images

2024-05-08 10:47:28
Aimira Baitieva, David Hurych, Victor Besnier, Olivier Bernard

Abstract

Automating visual inspection in industrial production lines is essential for increasing product quality across various industries. Anomaly detection (AD) methods serve as robust tools for this purpose. However, existing public datasets primarily consist of images without anomalies, limiting the practical application of AD methods in production settings. To address this challenge, we present (1) the Valeo Anomaly Dataset (VAD), a novel real-world industrial dataset comprising 5000 images, including 2000 instances of challenging real defects across more than 20 subclasses. Acknowledging that traditional AD methods struggle with this dataset, we introduce (2) Segmentation-based Anomaly Detector (SegAD). First, SegAD leverages anomaly maps as well as segmentation maps to compute local statistics. Next, SegAD uses these statistics and an optional supervised classifier score as input features for a Boosted Random Forest (BRF) classifier, yielding the final anomaly score. Our SegAD achieves state-of-the-art performance on both VAD (+2.1% AUROC) and the VisA dataset (+0.4% AUROC). The code and the models are publicly available.
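The first SegAD step described above, computing local statistics from an anomaly map and a segmentation map, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which statistics are used, so the mean, max, and standard deviation per segment are assumptions, and the names `segment_statistics` and `build_feature_vector` are hypothetical. The resulting vector is the kind of input a boosted random forest classifier could then consume.

```python
import numpy as np


def segment_statistics(anomaly_map, segmentation_map, num_segments):
    """Compute per-segment statistics of an anomaly map.

    Assumption: mean, max, and std are used as the local statistics;
    the abstract only says "local statistics".
    """
    features = []
    for seg_id in range(num_segments):
        # Select the anomaly scores of the pixels belonging to this segment.
        values = anomaly_map[segmentation_map == seg_id]
        if values.size == 0:
            # Segment absent from this image: fall back to zeros.
            features.extend([0.0, 0.0, 0.0])
        else:
            features.extend([values.mean(), values.max(), values.std()])
    return np.asarray(features, dtype=np.float64)


def build_feature_vector(stats, classifier_score=None):
    """Append the optional supervised classifier score to the statistics."""
    if classifier_score is None:
        return stats
    return np.append(stats, classifier_score)


# Toy 2x4 example with two segments (left half = 0, right half = 1).
anomaly_map = np.array([[0.0, 0.0, 1.0, 3.0],
                        [0.0, 0.0, 1.0, 3.0]])
segmentation_map = np.array([[0, 0, 1, 1],
                             [0, 0, 1, 1]])

stats = segment_statistics(anomaly_map, segmentation_map, num_segments=2)
features = build_feature_vector(stats, classifier_score=0.8)
```

In this sketch each image yields one fixed-length vector (three values per segment, plus one if a classifier score is supplied), so the vectors from many labeled images can be stacked into a training matrix for the final classifier.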


URL

https://arxiv.org/abs/2405.04953

PDF

https://arxiv.org/pdf/2405.04953.pdf

