BlinkFlow: A Dataset to Push the Limits of Event-based Optical Flow Estimation

2023-03-14 09:03:54
Yijin Li, Zhaoyang Huang, Shuo Chen, Xiaoyu Shi, Hongsheng Li, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

Abstract

Event cameras provide visual perception with high temporal precision, low data rates, and high dynamic range, properties that make them well suited to optical flow estimation. While data-driven optical flow estimation has achieved great success with RGB cameras, its generalization performance with event cameras is seriously hindered, mainly by limited and biased training data. In this paper, we present a novel simulator, BlinkSim, for the fast generation of large-scale data for event-based optical flow. BlinkSim consists of a configurable rendering engine and a flexible engine for event data simulation. By leveraging the wealth of existing 3D assets, the rendering engine enables us to automatically build thousands of scenes with different objects, textures, and motion patterns and to render very high-frequency images for realistic event data simulation. Based on BlinkSim, we construct BlinkFlow, a large training dataset and evaluation benchmark that contains sufficient, diverse, and challenging event data with optical flow ground truth. Experiments show that BlinkFlow improves the generalization performance of state-of-the-art methods by more than 40% on average and by up to 90%. Moreover, we propose an Event optical Flow transFormer (E-FlowFormer) architecture. Powered by BlinkFlow, E-FlowFormer outperforms the state-of-the-art (SOTA) methods by up to 91% on the MVSEC dataset and 14% on the DSEC dataset and presents the best generalization performance.
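The abstract does not say how BlinkSim turns rendered high-frequency images into events. For orientation only, here is a minimal sketch of the commonly used contrast-threshold event model, in which a pixel fires whenever its log intensity has changed by a fixed threshold since its last event. This is not BlinkSim's implementation: the function name `simulate_events`, its parameters, and the choice to stamp every event with its frame's timestamp (no sub-frame interpolation) are assumptions made for illustration.

```python
import numpy as np

def simulate_events(frames, timestamps, threshold=0.2, eps=1e-3):
    """Hypothetical contrast-threshold event generator (illustrative only).

    frames:     array of shape (T, H, W) with non-negative intensities
    timestamps: sequence of T frame times
    Returns a list of (t, x, y, polarity) tuples with polarity in {+1, -1}.
    """
    frames = np.asarray(frames, dtype=float)
    # Per-pixel reference log intensity, updated whenever a pixel fires.
    log_ref = np.log(frames[0] + eps)
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        diff = np.log(frame + eps) - log_ref
        # Signed number of whole threshold crossings at each pixel.
        crossings = np.trunc(diff / threshold).astype(int)
        ys, xs = np.nonzero(crossings)
        for y, x in zip(ys, xs):
            n = crossings[y, x]
            polarity = 1 if n > 0 else -1
            # Simplification: all events share the frame timestamp.
            events.extend((t, x, y, polarity) for _ in range(abs(n)))
            # Move the reference by the amount that was "consumed".
            log_ref[y, x] += n * threshold
    return events
```

In a model of this kind, a higher rendering frame rate shrinks the intensity change between consecutive frames, so fewer threshold crossings are collapsed into a single timestamp, which is one reason high-frequency rendering matters for realistic event simulation.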

URL

https://arxiv.org/abs/2303.07716

PDF

https://arxiv.org/pdf/2303.07716.pdf

