RoundaboutHD: High-Resolution Real-World Urban Environment Benchmark for Multi-Camera Vehicle Tracking

2025-07-11 16:30:27
Yuqiang Lin, Sam Lockyer, Mingxuan Sui, Li Gan, Florian Stanek, Markus Zarbock, Wenbin Li, Adrian Evans, Nic Zhang

Abstract

The multi-camera vehicle tracking (MCVT) framework holds significant potential for smart city applications, including anomaly detection, traffic density estimation, and suspect vehicle tracking. However, current publicly available datasets have limitations, such as overly simplistic scenarios, low-resolution footage, and insufficiently diverse conditions, creating a considerable gap between academic research and real-world scenarios. To fill this gap, we introduce RoundaboutHD, a comprehensive, high-resolution multi-camera vehicle tracking benchmark dataset specifically designed to represent real-world roundabout scenarios. RoundaboutHD provides a total of 40 minutes of labelled video footage captured by four non-overlapping, high-resolution (4K resolution, 15 fps) cameras. In total, 512 unique vehicle identities are annotated across different camera views, offering rich cross-camera association data. RoundaboutHD offers temporally consistent video footage and added challenges, including increased occlusions and nonlinear movement inside the roundabout. In addition to the full MCVT dataset, several subsets are also available for object detection, single-camera tracking, and image-based vehicle re-identification (ReID) tasks. Vehicle model information and camera modelling/geometry information are also included to support further analysis. We provide baseline results for vehicle detection, single-camera tracking, image-based vehicle re-identification, and multi-camera tracking. The dataset and the evaluation code are publicly available at: this https URL

URL

https://arxiv.org/abs/2507.08729

PDF

https://arxiv.org/pdf/2507.08729.pdf

