Paper Reading AI Learner

Toward Physics-Aware Deep Learning Architectures for LiDAR Intensity Simulation

2024-04-24 09:52:36
Vivek Anand, Bharat Lohani, Gaurav Pandey, Rakesh Mishra

Abstract

Autonomous vehicles (AVs) rely heavily on LiDAR perception for environment understanding and navigation. LiDAR intensity provides valuable information about the reflected laser signals and plays a crucial role in enhancing the perception capabilities of AVs. However, accurately simulating LiDAR intensity remains a challenge due to the unavailability of material properties of the objects in the environment and the complex interactions between the laser beam and the environment. The proposed method aims to improve the accuracy of intensity simulation by incorporating physics-based modalities within the deep learning framework. One of the key entities that captures the interaction between the laser beam and the objects is the angle of incidence. In this work, we demonstrate that adding the LiDAR incidence angle as a separate input to the deep neural networks significantly enhances the results. We present a comparative study between two prominent deep learning architectures: U-NET, a Convolutional Neural Network (CNN), and Pix2Pix, a Generative Adversarial Network (GAN). We implemented both architectures for the intensity prediction task and used the SemanticKITTI and VoxelScape datasets for experiments. The comparative analysis reveals that both architectures benefit from the incidence angle as an additional input. Moreover, the Pix2Pix architecture outperforms U-NET, especially when the incidence angle is incorporated.
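
The abstract describes supplying the LiDAR incidence angle to the networks as an additional input. Below is a minimal sketch (not the authors' code) of how such a channel could be computed and appended to a range-image input using NumPy and PyTorch; the five-channel layout, the assumption that surface normals are already estimated, and the function names are illustrative only, since the paper's exact input format is not given here.

```python
# Sketch: per-pixel LiDAR incidence angle as an extra input channel.
# Assumes a range-image representation (H x W) with precomputed surface normals.
import numpy as np
import torch
import torch.nn as nn


def incidence_angle(points: np.ndarray, normals: np.ndarray) -> np.ndarray:
    """Angle (radians) between each laser ray and the local surface normal.

    points  : (H, W, 3) Cartesian coordinates in the sensor frame
    normals : (H, W, 3) unit surface normals
    """
    rays = points / (np.linalg.norm(points, axis=-1, keepdims=True) + 1e-8)
    cos_theta = np.abs(np.sum(rays * normals, axis=-1))        # |cos(angle)|
    return np.arccos(np.clip(cos_theta, 0.0, 1.0))             # (H, W)


def build_input(points, ranges, normals):
    """Stack range, xyz, and incidence angle into a (5, H, W) tensor (assumed layout)."""
    angle = incidence_angle(points, normals)
    stack = np.concatenate(
        [ranges[..., None], points, angle[..., None]], axis=-1)  # (H, W, 5)
    return torch.from_numpy(stack).permute(2, 0, 1).float()      # (5, H, W)


# The only architectural change needed in a U-Net or Pix2Pix generator is the
# wider first convolution that accepts the extra incidence-angle channel:
first_conv = nn.Conv2d(in_channels=5, out_channels=64, kernel_size=3, padding=1)
```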

URL

https://arxiv.org/abs/2404.15774

PDF

https://arxiv.org/pdf/2404.15774.pdf

