Dynamic Traffic Scene Classification with Space-Time Coherence

2019-05-29 20:28:49
Athma Narayanan, Isht Dwivedi, Behzad Dariush

Abstract

This paper examines the problem of dynamic traffic scene classification under space-time variations in viewpoint that arise from video captured on-board a moving vehicle. Solutions to this problem are important for realizing effective driving assistance technologies required to interpret or predict road user behavior. Currently, dynamic traffic scene classification has not been adequately addressed due to a lack of benchmark datasets that consider the spatiotemporal evolution of traffic scenes resulting from a vehicle's ego-motion. This paper has three main contributions. First, an annotated dataset is released to enable dynamic scene classification; it includes 80 hours of diverse, high-quality driving video clips collected in the San Francisco Bay Area. The dataset includes temporal annotations for road places, road types, weather, and road surface conditions. Second, we introduce novel and baseline algorithms that utilize the semantic context and temporal nature of the dataset for dynamic classification of road scenes. Finally, we showcase algorithms and experimental results that highlight how features extracted from scene classification serve as strong priors and help with tactical driver behavior understanding. The results show significant improvement over previously reported driving behavior detection baselines in the literature.

URL

https://arxiv.org/abs/1905.12708

PDF

https://arxiv.org/pdf/1905.12708.pdf

