Abstract
Intelligent vehicle systems require a deep understanding of the interplay between road conditions, surrounding entities, and the ego vehicle's driving behavior for safe and efficient navigation. This is particularly critical in developing countries, where traffic is often dense and unstructured, with heterogeneous road occupants. Existing datasets, predominantly geared towards structured and sparse traffic scenarios, fall short of capturing the complexity of driving in such environments. To fill this gap, we present IDD-X, a large-scale dual-view driving video dataset. With 697K bounding boxes, 9K important object tracks, and 1-12 objects per video, IDD-X offers comprehensive ego-relative annotations for multiple important road objects, covering 10 object categories and 19 explanation label categories. The dataset also incorporates rearview information to provide a more complete representation of the driving environment. In addition, we introduce custom-designed deep networks for multiple important object localization and per-object explanation prediction. Overall, the dataset and the proposed prediction models form a foundation for studying how road conditions and surrounding entities affect driving behavior in complex traffic situations.
URL
https://arxiv.org/abs/2404.08561