Paper Reading AI Learner

Revisiting Out-of-Distribution Detection in LiDAR-based 3D Object Detection

2024-04-24 13:48:38
Michael Kösel, Marcel Schreiber, Michael Ulrich, Claudius Gläser, Klaus Dietmayer

Abstract

LiDAR-based 3D object detection has become an essential part of automated driving due to its ability to localize and classify objects precisely in 3D. However, object detectors face a critical challenge when dealing with unknown foreground objects, particularly those that were not present in their original training data. These out-of-distribution (OOD) objects can lead to misclassifications, posing a significant risk to the safety and reliability of automated vehicles. Currently, LiDAR-based OOD object detection has not been well studied. We address this problem by generating synthetic training data for OOD objects by perturbing known object categories. Our idea is that these synthetic OOD objects produce different responses in the feature map of an object detector compared to in-distribution (ID) objects. We then extract features using a pre-trained and fixed object detector and train a simple multilayer perceptron (MLP) to classify each detection as either ID or OOD. In addition, we propose a new evaluation protocol that allows the use of existing datasets without modifying the point cloud, ensuring a more authentic evaluation of real-world scenarios. The effectiveness of our method is validated through experiments on the newly proposed nuScenes OOD benchmark. The source code is available at this https URL.
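The pipeline the abstract outlines can be sketched in a few lines. Everything below is an illustrative assumption, not the paper's implementation: toy point clusters stand in for real LiDAR detections, an anisotropic-rescaling perturbation stands in for the paper's OOD synthesis, and simple geometric statistics stand in for features read from a frozen 3D detector. Only the overall structure (synthetic OOD by perturbing known objects, then a small MLP classifying each detection as ID or OOD) follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_object(n=64):
    """Toy stand-in for a known (ID) object's LiDAR points."""
    return rng.normal(0.0, 1.0, size=(n, 3)) * np.array([2.0, 1.0, 0.8])

def perturb(points):
    """Synthesize an OOD object by perturbing a known one
    (anisotropic rescaling + jitter here; the paper's exact
    perturbations may differ)."""
    scale = rng.uniform(0.2, 3.0, size=3)        # stretch/squash each axis
    return points * scale + rng.normal(0.0, 0.3, points.shape)

def features(points):
    """Hypothetical per-detection feature vector (extent + spread);
    the paper instead extracts features from a pre-trained, fixed detector."""
    extent = points.max(axis=0) - points.min(axis=0)
    return np.concatenate([extent, points.std(axis=0)])

# Build an ID/OOD training set: 0 = in-distribution, 1 = out-of-distribution.
X = np.array([features(make_object()) for _ in range(200)]
             + [features(perturb(make_object())) for _ in range(200)])
y = np.array([0] * 200 + [1] * 200)

# Tiny one-hidden-layer MLP trained with full-batch gradient descent
# on binary cross-entropy (a minimal numpy stand-in for the paper's MLP).
mu, sd = X.mean(axis=0), X.std(axis=0) + 1e-8
Xn = (X - mu) / sd
W1 = rng.normal(0, 0.5, (X.shape[1], 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 1));          b2 = np.zeros(1)
for _ in range(500):
    h = np.tanh(Xn @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))     # predicted P(OOD)
    g = (p - y[:, None]) / len(y)                # dBCE/dlogit
    gW2, gb2 = h.T @ g, g.sum(axis=0)
    gh = (g @ W2.T) * (1.0 - h ** 2)             # tanh backprop
    gW1, gb1 = Xn.T @ gh, gh.sum(axis=0)
    for P, G in ((W1, gW1), (b1, gb1), (W2, gW2), (b2, gb2)):
        P -= 0.3 * G

p = 1.0 / (1.0 + np.exp(-(np.tanh(Xn @ W1 + b1) @ W2 + b2)))
acc = ((p[:, 0] > 0.5).astype(int) == y).mean()
print(f"train accuracy: {acc:.2f}")
```

On this synthetic data the two classes are largely separable by extent and spread alone, so the classifier converges quickly; with real detector features the same binary head would be trained on detections from the frozen detector instead.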

URL

https://arxiv.org/abs/2404.15879

PDF

https://arxiv.org/pdf/2404.15879.pdf

