Paper Reading AI Learner

RaLiBEV: Radar and LiDAR BEV Fusion Learning for Anchor Box Free Object Detection System

2022-11-11 10:24:42
Yanlong Yang, Jianan Liu, Tao Huang, Qing-Long Han, Gang Ma, Bing Zhu

Abstract

Radar, the only sensor that could provide reliable perception capability in all weather conditions at an affordable cost, has been widely accepted as a key supplement to camera and LiDAR in modern advanced driver assistance systems (ADAS) and autonomous driving systems. Recent state-of-the-art works reveal that fusion of radar and LiDAR can lead to robust detection in adverse weather, such as fog. However, these methods still suffer from low accuracy of bounding box estimations. This paper proposes a bird's-eye view (BEV) fusion learning for an anchor box-free object detection system, which uses the feature derived from the radar range-azimuth heatmap and the LiDAR point cloud to estimate the possible objects. Different label assignment strategies have been designed to facilitate the consistency between the classification of foreground or background anchor points and the corresponding bounding box regressions. Furthermore, the performance of the proposed object detector can be further enhanced by employing a novel interactive transformer module. We demonstrated the superior performance of the proposed methods in this paper using the recently published Oxford Radar RobotCar (ORR) dataset. We showed that the accuracy of our system significantly outperforms the other state-of-the-art methods by a large margin.

Abstract (translated)

URL

https://arxiv.org/abs/2211.06108

PDF

https://arxiv.org/pdf/2211.06108.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot