Abstract
Recent learning-based methods for event-based optical flow estimation use cost volumes for pixel matching but suffer from redundant computation and scale poorly to higher resolutions during flow refinement. In this work, we exploit the complementarity between the temporally dense feature differences of adjacent event frames and the cost volume, and present a lightweight event-based optical flow network (EDCFlow) that achieves high-quality flow estimation at higher resolution. Specifically, an attention-based multi-scale temporal feature difference layer captures diverse motion patterns at high resolution in a computation-efficient manner, and an adaptive fusion of high-resolution difference motion features with low-resolution correlation motion features enhances the motion representation and model generalization. Notably, EDCFlow can serve as a plug-and-play refinement module for RAFT-like event-based methods to enhance flow details. Extensive experiments demonstrate that EDCFlow outperforms existing methods at lower complexity and generalizes better.
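To make the described architecture concrete, below is a minimal PyTorch sketch of the two components the abstract names: an attention-weighted temporal feature difference layer over adjacent event-frame features, and a gated adaptive fusion of high-resolution difference features with upsampled low-resolution correlation features. Everything here (module and parameter names, channel sizes, softmax attention over the differences, sigmoid gating, bilinear upsampling, and the single-scale simplification of the multi-scale layer) is an assumption for illustration, not the authors' implementation.

# Hypothetical sketch; see hedging note above. Not the official EDCFlow code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalDiffLayer(nn.Module):
    """Attention-weighted differences of adjacent event-frame features (assumed design)."""

    def __init__(self, channels: int):
        super().__init__()
        self.attn = nn.Conv2d(channels, 1, kernel_size=1)   # per-difference attention score
        self.proj = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, T, C, H, W) features of T temporally dense event frames
        diffs = feats[:, 1:] - feats[:, :-1]                # (B, T-1, C, H, W)
        B, N, C, H, W = diffs.shape
        flat = diffs.reshape(B * N, C, H, W)
        scores = self.attn(flat).reshape(B, N, 1, H, W)     # score each of the T-1 diffs
        weights = scores.softmax(dim=1)                     # attention across the temporal axis
        fused = (weights * diffs).sum(dim=1)                # (B, C, H, W) motion feature
        return self.proj(fused)

class AdaptiveFusion(nn.Module):
    """Gated fusion of high-res difference and low-res correlation features (assumed design)."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, diff_hr: torch.Tensor, corr_lr: torch.Tensor) -> torch.Tensor:
        # Upsample the correlation feature to the high-resolution grid
        corr_up = F.interpolate(corr_lr, size=diff_hr.shape[-2:],
                                mode="bilinear", align_corners=False)
        g = torch.sigmoid(self.gate(torch.cat([diff_hr, corr_up], dim=1)))
        return g * diff_hr + (1.0 - g) * corr_up            # per-pixel adaptive mix

if __name__ == "__main__":
    B, T, C, H, W = 1, 5, 32, 64, 64
    diff_layer = TemporalDiffLayer(C)
    fusion = AdaptiveFusion(C)
    event_feats = torch.randn(B, T, C, H, W)
    diff_hr = diff_layer(event_feats)                       # high-resolution motion cue
    corr_lr = torch.randn(B, C, H // 4, W // 4)             # stand-in correlation feature
    out = fusion(diff_hr, corr_lr)
    print(out.shape)                                        # torch.Size([1, 32, 64, 64])

The per-pixel gate lets the network favor the sharp high-resolution difference cue where it is reliable and fall back on the correlation cue elsewhere, which is one plausible reading of the adaptive fusion described in the abstract.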
URL
https://arxiv.org/abs/2506.03512