Complementary Pseudo Multimodal Feature for Point Cloud Anomaly Detection

2023-03-23 11:52:17
Yunkang Cao, Xiaohao Xu, Weiming Shen

Abstract

Point cloud (PCD) anomaly detection is steadily emerging as a promising research area. This study aims to improve PCD anomaly detection performance by combining handcrafted PCD descriptors with powerful pre-trained 2D neural networks. To this end, it proposes the Complementary Pseudo Multimodal Feature (CPMF), which incorporates local geometrical information in the 3D modality using handcrafted PCD descriptors and global semantic information in a generated pseudo 2D modality using pre-trained 2D neural networks. For global semantics extraction, CPMF projects the original PCD into a pseudo 2D modality containing multi-view images. These images are fed to pre-trained 2D neural networks to extract informative 2D modality features. The 3D and 2D modality features are then aggregated to obtain the CPMF for PCD anomaly detection. Extensive experiments demonstrate the complementarity of the 2D and 3D modality features and the effectiveness of CPMF, which achieves 95.15% image-level AU-ROC and 92.93% pixel-level PRO on the MVTec3D benchmark. Code is available on this https URL.
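
The pipeline described in the abstract (multi-view projection, per-view 2D feature extraction, aggregation with 3D features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Every function name here is hypothetical, `toy_local_descriptor` merely stands in for a handcrafted PCD descriptor, and `toy_2d_features` stands in for a pre-trained 2D network; the orthographic depth rendering, the averaging across views, and the normalize-then-concatenate fusion are likewise assumptions chosen for brevity.

```python
import numpy as np


def rotation_matrix(yaw, pitch):
    """Rotation about the y-axis (yaw), then the x-axis (pitch), in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    rot_y = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    return rot_x @ rot_y


def project_view(points, rotation, size=32):
    """Render one orthographic depth image (assumed rendering scheme).

    Returns the image and the (row, col) pixel each point lands on.
    """
    rotated = points @ rotation.T
    xy = rotated[:, :2]
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)
    pix = np.round((xy - lo) / scale * (size - 1)).astype(int)
    cols, rows = pix[:, 0], pix[:, 1]
    depth = np.full((size, size), np.inf)
    # Keep the smallest depth value among the points that share a pixel.
    np.minimum.at(depth, (rows, cols), rotated[:, 2])
    # Fill empty pixels with the largest depth as background.
    depth[np.isinf(depth)] = rotated[:, 2].max()
    return depth, rows, cols


def toy_2d_features(depth):
    """Stand-in for a pre-trained 2D network: depth plus its image gradients."""
    grad_rows, grad_cols = np.gradient(depth)
    return np.stack([depth, grad_rows, grad_cols], axis=-1)


def toy_local_descriptor(points, k=8):
    """Stand-in for a handcrafted descriptor: k-nearest-neighbor distance stats."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    knn = np.sort(dist, axis=1)[:, 1:k + 1]  # column 0 is the self-distance
    return np.stack([knn.mean(axis=1), knn.std(axis=1), knn.max(axis=1)], axis=1)


def l2_normalize(x, eps=1e-8):
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def pseudo_multimodal_features(points, views, size=32):
    """Per-point features: 3D descriptor concatenated with view-averaged 2D features."""
    feats_2d = []
    for yaw, pitch in views:
        depth, rows, cols = project_view(points, rotation_matrix(yaw, pitch), size)
        # Look up, for every point, the 2D feature at the pixel it projects to.
        feats_2d.append(toy_2d_features(depth)[rows, cols])
    feat_2d = np.mean(feats_2d, axis=0)
    feat_3d = toy_local_descriptor(points)
    return np.concatenate([l2_normalize(feat_3d), l2_normalize(feat_2d)], axis=1)


rng = np.random.default_rng(0)
points = rng.normal(size=(200, 3))  # synthetic point cloud, not benchmark data
views = [(0.0, 0.0), (np.pi / 6, 0.0), (0.0, np.pi / 6)]
features = pseudo_multimodal_features(points, views)
print(features.shape)  # (200, 6)
```

Because each point keeps track of the pixel it projects to in every view, the 2D features can be mapped back onto the points and combined with the 3D descriptor per point. The sketch stops at feature construction; how anomaly scores are computed from these features is not specified in the abstract and is not shown.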

Abstract (translated)

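Point cloud (PCD) anomaly detection is gradually becoming a promising research area. This study aims to improve PCD anomaly detection performance by combining handcrafted point cloud descriptors with powerful pre-trained 2D neural networks. To this end, it proposes the Complementary Pseudo Multimodal Feature (CPMF), which combines local geometric information in the 3D modality, obtained with handcrafted point cloud descriptors, with global semantic information in a generated pseudo 2D modality, obtained with pre-trained 2D neural networks. To extract global semantic information, CPMF converts the original point cloud into a pseudo 2D modality containing multi-view images. These images are passed to pre-trained 2D neural networks to extract informative 2D modality features. The 3D and 2D modality features are aggregated to obtain the CPMF for PCD anomaly detection. Extensive experiments demonstrate the complementarity of the 2D and 3D modality features and the effectiveness of CPMF, with PCD anomaly detection performance reaching 95.15% on the MVTec3D benchmark. Code is available on this https URL.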

URL

https://arxiv.org/abs/2303.13194

PDF

https://arxiv.org/pdf/2303.13194.pdf
