LaneCorrect: Self-supervised Lane Detection

2024-04-23 01:55:09
Ming Nie, Xinyue Cai, Hang Xu, Li Zhang

Abstract

Lane detection has enabled highly functional autonomous driving systems to understand driving scenes even in complex environments. In this paper, we work towards developing a generalized computer vision system able to detect lanes without using any annotation. We make the following contributions: (i) We illustrate how to perform unsupervised 3D lane segmentation by leveraging the distinctive intensity of lanes in LiDAR point cloud frames, and then obtain noisy lane labels in the 2D plane by projecting the 3D points; (ii) We propose a novel self-supervised training scheme, dubbed LaneCorrect, that automatically corrects the lane labels by learning geometric consistency and instance awareness from adversarial augmentations; (iii) From the self-supervised pre-trained model, we distill a student network for lane detection on an arbitrary target dataset (e.g., TuSimple) without any human labels; (iv) We thoroughly evaluate our self-supervised method on four major lane detection benchmarks (TuSimple, CULane, CurveLanes and LLAMAS) and demonstrate excellent performance compared with existing supervised counterparts, whilst showing more effective results in alleviating the domain gap, e.g., when training on CULane and testing on TuSimple.
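
Contribution (i) can be illustrated with a minimal sketch: keep the LiDAR points whose reflectance intensity is high (lane paint tends to be more reflective than asphalt) and project them into the image with a pinhole camera model. This is not the authors' implementation. The function name `pseudo_lane_pixels`, the fixed threshold `intensity_thresh`, and the calibration inputs `K` and `T_cam_lidar` are assumptions made for illustration only, and a simple threshold stands in for whatever segmentation procedure the paper actually uses.

```python
import numpy as np

def pseudo_lane_pixels(points, intensity, K, T_cam_lidar, image_size,
                       intensity_thresh=0.7):
    """Project high-intensity LiDAR points into the image plane.

    points:       (N, 3) xyz coordinates in the LiDAR frame
    intensity:    (N,) reflectance values, assumed normalized to [0, 1]
    K:            (3, 3) camera intrinsic matrix
    T_cam_lidar:  (4, 4) rigid transform from LiDAR frame to camera frame
    image_size:   (width, height) in pixels
    Returns an (M, 2) array of (u, v) pixel coordinates (noisy lane labels).
    """
    # Hypothetical heuristic: a fixed threshold on reflectance intensity.
    pts = points[intensity >= intensity_thresh]

    # Move the points into the camera frame using homogeneous coordinates.
    homo = np.hstack([pts, np.ones((len(pts), 1))])
    cam = (T_cam_lidar @ homo.T).T[:, :3]

    # Discard points behind the camera before the perspective divide.
    cam = cam[cam[:, 2] > 0]

    # Pinhole projection: apply intrinsics, then divide by depth.
    uvw = (K @ cam.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]

    # Keep only projections that fall inside the image bounds.
    width, height = image_size
    inside = ((uv[:, 0] >= 0) & (uv[:, 0] < width) &
              (uv[:, 1] >= 0) & (uv[:, 1] < height))
    return uv[inside]
```

The returned pixels would only be noisy labels, since other reflective objects and calibration errors also produce high-intensity points; that noise is what the label-correction stage in contribution (ii) is described as addressing.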

URL

https://arxiv.org/abs/2404.14671

PDF

https://arxiv.org/pdf/2404.14671.pdf

