Collaborative Perception Datasets for Autonomous Driving: A Review

2025-04-17 06:49:21
Naibang Wang, Deyong Shang, Yan Gong, Xiaoxi Hu, Ziying Song, Lei Yang, Yuhan Huang, Xiaoyu Wang, Jianli Lu

Abstract

Collaborative perception has attracted growing interest from academia and industry due to its potential to enhance perception accuracy, safety, and robustness in autonomous driving through multi-agent information fusion. With the advancement of Vehicle-to-Everything (V2X) communication, numerous collaborative perception datasets have emerged, varying in cooperation paradigms, sensor configurations, data sources, and application scenarios. However, the absence of systematic summarization and comparative analysis hinders effective resource utilization and standardization of model evaluation. As the first comprehensive review focused on collaborative perception datasets, this work reviews and compares existing resources from a multi-dimensional perspective. We categorize datasets based on cooperation paradigms, examine their data sources and scenarios, and analyze sensor modalities and supported tasks. A detailed comparative analysis is conducted across multiple dimensions. We also outline key challenges and future directions, including dataset scalability, diversity, domain adaptation, standardization, privacy, and the integration of large language models. To support ongoing research, we provide a continuously updated online repository of collaborative perception datasets and related literature: this https URL.

URL

https://arxiv.org/abs/2504.12696

PDF

https://arxiv.org/pdf/2504.12696.pdf

