PVStereo: Pyramid Voting Module for End-to-End Self-Supervised Stereo Matching

2021-03-12 05:27:14

Hengli Wang, Rui Fan, Peide Cai, Ming Liu

arXiv_CV

Abstract

Supervised learning with deep convolutional neural networks (DCNNs) has been widely adopted in stereo matching. However, acquiring large-scale datasets with well-labeled ground truth is cumbersome and labor-intensive, which often makes supervised approaches hard to apply in practice. To overcome this drawback, we propose a robust and effective self-supervised stereo matching approach consisting of a pyramid voting module (PVM) and a novel DCNN architecture, referred to as OptStereo. Specifically, OptStereo first builds multi-scale cost volumes and then adopts a recurrent unit to iteratively update disparity estimations at high resolution, while the PVM generates reliable semi-dense disparity images that are used to supervise OptStereo training. Furthermore, we publish the HKUST-Drive dataset, a large-scale synthetic stereo dataset collected under different illumination and weather conditions, for research purposes. Extensive experimental results demonstrate the effectiveness and efficiency of our self-supervised stereo matching approach on the KITTI Stereo benchmarks and our HKUST-Drive dataset. PVStereo, our best-performing implementation, greatly outperforms all other state-of-the-art self-supervised stereo matching approaches. Our project page is available at this http URL.
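
To illustrate what supervising a network with semi-dense disparity images can look like, here is a minimal sketch of a masked loss that only counts pixels carrying a label. The abstract does not specify the loss used to train OptStereo, so the L1 error, the use of `None` to mark unlabeled pixels, and the name `masked_l1_loss` are all assumptions made for this example, not the authors' implementation.

```python
from typing import List, Optional

# Assumption: a semi-dense disparity image stores None wherever the
# voting step produced no reliable disparity for that pixel.
SemiDense = List[List[Optional[float]]]


def masked_l1_loss(predicted: List[List[float]], semi_dense: SemiDense) -> float:
    """Mean absolute disparity error over labeled pixels only.

    Hypothetical helper: the L1 form is an assumption, chosen because it
    is a common choice for disparity regression.
    """
    total = 0.0
    count = 0
    for pred_row, label_row in zip(predicted, semi_dense):
        for pred, label in zip(pred_row, label_row):
            if label is None:
                # Unlabeled pixel: contributes nothing to the loss.
                continue
            total += abs(pred - label)
            count += 1
    # With no labeled pixels there is no supervision signal; return zero.
    return total / count if count else 0.0


# Toy 2x3 example: three of the six pixels carry a label.
predicted = [[10.0, 12.0, 30.0], [8.0, 9.5, 11.0]]
semi_dense = [[10.5, None, 29.0], [None, 9.5, None]]
loss = masked_l1_loss(predicted, semi_dense)
print(loss)
```

Because unlabeled pixels are skipped rather than treated as zero disparity, a sparse but reliable label map can still drive training without penalizing the network where no label exists.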

URL

https://arxiv.org/abs/2103.07094

PDF

https://arxiv.org/pdf/2103.07094.pdf