Abstract
We present a self-supervised learning approach for optical flow. Our method distills reliable flow estimates from non-occluded pixels and uses these predictions as ground truth to learn optical flow for hallucinated occlusions. We further design a simple CNN that uses temporal information from multiple frames for better flow estimation. Together, these two principles yield the best performance for unsupervised optical flow learning on challenging benchmarks, including MPI Sintel and KITTI 2012 and 2015. More notably, our self-supervised pre-trained model provides an excellent initialization for supervised fine-tuning. Our fine-tuned models achieve state-of-the-art results on all three datasets. At the time of writing, we achieve an end-point error (EPE) of 4.26 on the Sintel benchmark, outperforming all submitted methods.
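As a rough illustration of two ideas named in the abstract, the sketch below computes the end-point error (the mean Euclidean distance between predicted and reference flow vectors) and a toy distillation loss restricted to pixels that are occluded in the student's input but visible in the teacher's. The function names, the flat list-of-pixels layout, the boolean occlusion flags, and the plain Euclidean penalty are all assumptions made for illustration; the abstract does not specify the loss, so this should not be read as the paper's actual formulation.

```python
import math


def endpoint_error(pred, target):
    """Mean Euclidean distance between paired (u, v) flow vectors."""
    if len(pred) != len(target) or not pred:
        raise ValueError("flows must be non-empty and of equal length")
    total = sum(
        math.hypot(pu - tu, pv - tv)
        for (pu, pv), (tu, tv) in zip(pred, target)
    )
    return total / len(pred)


def distillation_loss(student, teacher, occ_student, occ_teacher):
    """Hypothetical loss: compare student and teacher flow only where the
    pixel is occluded for the student but visible for the teacher, so the
    teacher's estimate can serve as a proxy for ground truth there."""
    selected = [
        i for i, (s, t) in enumerate(zip(occ_student, occ_teacher))
        if s and not t
    ]
    if not selected:
        return 0.0  # no hallucinated occlusions in this sample
    return endpoint_error(
        [student[i] for i in selected],
        [teacher[i] for i in selected],
    )
```

In a real training setup the teacher's predictions would be treated as fixed targets, and the occlusion flags would come from an occlusion estimate together with the artificially injected occluders; both are outside the scope of this sketch.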
URL
https://arxiv.org/abs/1904.09117