Multi-feature Reconstruction Network using Crossed-mask Restoration for Unsupervised Anomaly Detection

2024-04-20 05:13:56
Junpu Wang, Guili Xu, Chunlei Li, Guangshuai Gao, Yuehua Cheng

Abstract

Unsupervised anomaly detection using only normal samples is of great significance for quality inspection in industrial manufacturing. Although existing reconstruction-based methods have achieved promising results, they still face two problems: image reconstruction carries little discriminative information, and anomalous regions are regenerated too well because the model over-generalizes. To overcome these issues, we convert image reconstruction into a combination of parallel feature restorations and propose a multi-feature reconstruction network, MFRNet, using crossed-mask restoration. Specifically, a multi-scale feature aggregator is first developed to generate more discriminative hierarchical representations of the input images from a pre-trained model. Subsequently, a crossed-mask generator randomly covers the extracted feature map, and a transformer-based restoration network performs high-quality repair of the missing regions. Finally, a hybrid loss that considers both pixel-level and structural similarity guides model training and anomaly estimation. Extensive experiments show that our method is highly competitive with or significantly outperforms other state-of-the-art methods on four publicly available datasets and one self-made dataset.
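
The masking and scoring steps described in the abstract can be illustrated with a rough sketch. The abstract does not define the crossed-mask generator or the hybrid loss in detail, so everything below is an assumption made for illustration: "crossed" is read here as a pair of complementary random patch masks, the transformer restoration network is replaced by a trivial mean-fill placeholder, and the hybrid term mixes squared error with an SSIM-style statistic computed over channels. All function names are hypothetical and none of this is taken from the MFRNet implementation.

```python
import numpy as np


def crossed_masks(height, width, patch=2, ratio=0.5, seed=0):
    """Return two complementary visibility masks (1 = visible, 0 = hidden).

    Assumption: "crossed" is interpreted as complementary, so every
    location is hidden in exactly one of the two masks.
    """
    rng = np.random.default_rng(seed)
    gh, gw = -(-height // patch), -(-width // patch)  # ceil division
    n = gh * gw
    hidden = np.zeros(n, dtype=bool)
    hidden[rng.choice(n, size=int(round(ratio * n)), replace=False)] = True
    grid = hidden.reshape(gh, gw)
    # Expand the patch grid to full resolution and crop to the map size.
    full = np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)
    full = full[:height, :width]
    return (~full).astype(float), full.astype(float)


def mean_fill_restore(masked_feat, visible):
    """Placeholder for the restoration network (NOT a transformer).

    Fills hidden locations with the per-channel mean of visible ones.
    """
    vis = visible.astype(bool)
    out = masked_feat.copy()
    means = masked_feat[:, vis].mean(axis=1)  # shape (C,)
    out[:, ~vis] = means[:, None]
    return out


def hybrid_error(feat, recon, alpha=0.5, eps=1e-6):
    """Per-location score: squared error plus (1 - SSIM-style term).

    The SSIM-style statistic is computed across channels at each
    location, which is a simplification chosen for this sketch.
    """
    mse = ((feat - recon) ** 2).mean(axis=0)
    mu_f, mu_r = feat.mean(axis=0), recon.mean(axis=0)
    var_f, var_r = feat.var(axis=0), recon.var(axis=0)
    cov = ((feat - mu_f) * (recon - mu_r)).mean(axis=0)
    ssim = ((2 * mu_f * mu_r + eps) * (2 * cov + eps)) / (
        (mu_f ** 2 + mu_r ** 2 + eps) * (var_f + var_r + eps)
    )
    return alpha * mse + (1 - alpha) * (1 - ssim)


def anomaly_map(feat, restore=mean_fill_restore, patch=2, seed=0):
    """Restore each location from a pass in which it was hidden, then score."""
    _, h, w = feat.shape
    vis_a, vis_b = crossed_masks(h, w, patch=patch, seed=seed)
    rec_a = restore(feat * vis_a, vis_a)
    rec_b = restore(feat * vis_b, vis_b)
    # vis_b marks what pass A had to restore, and vice versa.
    recon = vis_b * rec_a + vis_a * rec_b
    return hybrid_error(feat, recon)


if __name__ == "__main__":
    features = np.ones((4, 8, 8))   # stand-in for an aggregated feature map
    features[:, 0, 0] = 5.0         # synthetic "defect"
    scores = anomaly_map(features)
    print(np.unravel_index(scores.argmax(), scores.shape))
```

Because each location is scored against a restoration made while that location was hidden, the network cannot simply copy an anomalous input through, which is the over-generalization problem the abstract points to. In this sketch, `restore` would be swapped for a trained model and `features` for the output of a pre-trained backbone.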

URL

https://arxiv.org/abs/2404.13273

PDF

https://arxiv.org/pdf/2404.13273.pdf

