Spatial-temporal Fusion Convolutional Neural Network for Simulated Driving Behavior Recognition

2018-12-03 09:18:18
Yaocong Hu, MingQi Lu, Xiaobo Lu

Abstract

Abnormal driving behaviour is one of the leading causes of serious traffic accidents that endanger human life. The study of driving behaviour surveillance has therefore become essential to traffic security and public management. In this paper, we employ a two-stream CNN framework for video-based driving behaviour recognition, in which the spatial stream CNN captures appearance information from still frames, while the temporal stream CNN captures motion information from pre-computed optical flow displacement between a few adjacent video frames. We investigate different spatial-temporal fusion strategies to combine the intra-frame static cues and inter-frame dynamic cues for final behaviour recognition. To validate the effectiveness of the designed spatial-temporal deep learning model, we create a simulated driving behaviour dataset containing 1237 videos covering 6 different driving behaviours. Experimental results show that our proposed method obtains noticeable performance improvements over existing methods.
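
The abstract does not say which fusion strategies were compared, so the following is only a minimal sketch of one common option, score-level (late) fusion, in which each stream's class scores are turned into probabilities and combined by a weighted average. The function names, the `spatial_weight` parameter, and the example logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(spatial_logits, temporal_logits, spatial_weight=0.5):
    """Weighted average of the per-class probabilities of the two streams.

    spatial_weight is a hypothetical knob: 1.0 trusts only the appearance
    stream, 0.0 trusts only the optical-flow stream.
    """
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= spatial_weight <= 1.0:
        raise ValueError("spatial_weight must lie in [0, 1]")
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    return [
        spatial_weight * ps + (1.0 - spatial_weight) * pt
        for ps, pt in zip(p_spatial, p_temporal)
    ]


def predict(spatial_logits, temporal_logits, spatial_weight=0.5):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_scores(spatial_logits, temporal_logits, spatial_weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up scores for 6 behaviour classes (illustration only, not paper data).
spatial_logits = [2.0, 0.5, 0.1, 0.0, -1.0, 0.3]   # appearance stream
temporal_logits = [0.2, 3.0, 0.1, 0.0, -0.5, 0.4]  # optical-flow stream

equal_vote = predict(spatial_logits, temporal_logits)
spatial_heavy = predict(spatial_logits, temporal_logits, spatial_weight=0.9)
```

With these made-up scores the two streams disagree, so the fused prediction depends on the weight: an equal vote follows the more confident temporal stream, while a spatial-heavy weight follows the appearance stream. Fusing intermediate feature maps inside the network is a different design and would require a deep learning framework rather than plain Python.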

URL

https://arxiv.org/abs/1812.00615

PDF

https://arxiv.org/pdf/1812.00615.pdf

