Paper Reading AI Learner

TMA: Temporal Motion Aggregation for Event-based Optical Flow

2023-03-21 06:51:31
Haotian Liu, Guang Chen, Sanqing Qu, Yanping Zhang, Zhijun Li, Alois Knoll, Changjun Jiang

Abstract

Event cameras can record continuous and detailed object trajectories with high temporal resolution, thereby providing intuitive motion cues for optical flow estimation. Nevertheless, most existing learning-based approaches to event optical flow estimation directly remould the paradigm of conventional images by representing the consecutive event stream as static frames, ignoring the inherent temporal continuity of event data. In this paper, we argue that temporal continuity is a vital element of event-based optical flow and propose a novel Temporal Motion Aggregation (TMA) approach to unlock its potential. Technically, TMA comprises three components: an event splitting strategy to incorporate the intermediate motion information underlying the temporal context, a linear lookup strategy to align temporally continuous motion features, and a novel motion pattern aggregation module that emphasizes consistent patterns to enhance motion features. By incorporating temporally continuous motion information, TMA derives better flow estimates than existing methods at early stages, which not only enables TMA to obtain more accurate final predictions but also greatly reduces the number of refinement iterations required. Extensive experiments on the DSEC-Flow and MVSEC datasets verify the effectiveness and superiority of our TMA. Remarkably, compared to E-RAFT, TMA achieves a 6% improvement in accuracy and a 40% reduction in inference time on DSEC-Flow.
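The event splitting strategy mentioned in the abstract can be illustrated with a minimal sketch: partition a time-sorted event stream into equal-duration groups, then accumulate each group into a polarity-separated count image. This is an assumption-laden illustration of the general idea (function names, the (t, x, y, p) layout, and the count-image representation are all hypothetical), not the authors' implementation.

```python
import numpy as np

def split_events(events, num_groups):
    """Split an event stream into equal-duration temporal groups.

    `events` is an (N, 4) array of (t, x, y, p) rows sorted by timestamp t.
    Returns a list of `num_groups` sub-arrays covering consecutive time bins.
    """
    t = events[:, 0]
    # Equal-width time bin edges spanning the stream's duration.
    edges = np.linspace(t[0], t[-1], num_groups + 1)
    # Indices where each interior bin edge would be inserted.
    split_idx = np.searchsorted(t, edges[1:-1])
    return np.split(events, split_idx)

def to_count_image(events, height, width):
    """Accumulate events into a 2-channel (negative/positive polarity) count image."""
    img = np.zeros((2, height, width), dtype=np.float32)
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = (events[:, 3] > 0).astype(int)  # channel 0: negative, channel 1: positive
    np.add.at(img, (p, y, x), 1.0)     # unbuffered add handles repeated pixels
    return img

# Example: 4 events split into 3 temporal groups, then rasterized per group.
events = np.array([[0.0, 1, 1, 1],
                   [0.3, 2, 2, -1],
                   [0.6, 1, 1, 1],
                   [0.9, 0, 0, -1]])
groups = split_events(events, 3)
frames = [to_count_image(g, 4, 4) for g in groups]
```

Each intermediate frame carries motion information from its sub-interval, which is the kind of temporal context a single accumulated frame would discard.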

URL

https://arxiv.org/abs/2303.11629

PDF

https://arxiv.org/pdf/2303.11629.pdf
