Advancing Autonomous Driving System Testing: Demands, Challenges, and Future Directions

2025-12-09 06:33:27
Yihan Liao, Jingyu Zhang, Jacky Keung, Yan Xiao, Yurou Dai

Abstract

Autonomous driving systems (ADSs) promise improved transportation efficiency and safety, yet ensuring their reliability in complex real-world environments remains a critical challenge. Effective testing is essential to validate ADS performance and reduce deployment risks. This study investigates current ADS testing practices for both modular and end-to-end systems, identifies key demands from industry practitioners and academic researchers, and analyzes the gaps between existing research and real-world requirements. We review major testing techniques and further consider emerging factors such as Vehicle-to-Everything (V2X) communication and foundation models, including large language models and vision foundation models, to understand their roles in enhancing ADS testing. We conducted a large-scale survey with 100 participants from both industry and academia. Survey questions were refined through expert discussions, and the responses were analyzed quantitatively and qualitatively to reveal key trends, challenges, and unmet needs. Our results show that existing ADS testing techniques struggle to comprehensively evaluate real-world performance, particularly regarding corner-case diversity, the simulation-to-reality gap, the lack of systematic testing criteria, exposure to potential attacks, practical challenges in V2X deployment, and the high computational cost of foundation-model-based testing. By further analyzing participant responses together with 105 representative studies, we summarize the current research landscape and highlight major limitations. This study consolidates critical research gaps in ADS testing and outlines key future research directions, including comprehensive testing criteria, cross-model collaboration in V2X systems, cross-modality adaptation for foundation-model-based testing, and scalable validation frameworks for large-scale ADS evaluation.

URL

https://arxiv.org/abs/2512.11887

PDF

https://arxiv.org/pdf/2512.11887.pdf

