Abstract
Vision-based localization for autonomous driving has attracted great interest among researchers. When a pre-built 3D map is unavailable, visual simultaneous localization and mapping (SLAM) techniques are typically adopted. However, due to error accumulation, visual SLAM (vSLAM) usually suffers from long-term drift. This paper proposes a framework that improves localization accuracy by fusing vSLAM with a deep-learning-based ground-to-satellite (G2S) image registration method. Within this framework, a coarse-to-fine method, consisting of a spatial correlation bound check followed by a visual odometry consistency check, is designed to select valid G2S predictions. Each selected prediction is then fused with the SLAM measurements by solving a scaled pose graph problem. To further improve localization accuracy, we provide an iterative trajectory fusion pipeline. The proposed framework is evaluated on two well-known autonomous driving datasets, and the results demonstrate its accuracy and robustness for vehicle localization.
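The two core ideas, gating G2S fixes by a spatial bound and fusing the accepted fixes with the up-to-scale SLAM trajectory, can be illustrated with a minimal sketch. All function names, the bound value, and the closed-form scale-and-translation fit below are our own illustration under simplifying 2D assumptions, not the paper's actual implementation (which solves a full scaled pose graph):

```python
import numpy as np

def select_valid(slam_xy, g2s_xy, bound=5.0):
    """Coarse check: keep only G2S position fixes that lie within
    `bound` meters of the current (roughly aligned) SLAM estimate.
    Both inputs are (N, 2) arrays of planar positions."""
    return np.linalg.norm(slam_xy - g2s_xy, axis=1) < bound

def fit_scale_translation(src, dst):
    """Least-squares similarity fit (scale + translation, no rotation):
    find s, t minimizing sum ||s * src_i + t - dst_i||^2.
    A stand-in for the scaled pose graph alignment between the
    up-to-scale SLAM trajectory and accepted global G2S fixes."""
    src_c = src - src.mean(axis=0)          # center both point sets
    dst_c = dst - dst.mean(axis=0)
    s = (src_c * dst_c).sum() / (src_c ** 2).sum()  # closed-form scale
    t = dst.mean(axis=0) - s * src.mean(axis=0)     # closed-form offset
    return s, t

if __name__ == "__main__":
    slam = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
    g2s = 2.0 * slam + np.array([1.0, -1.0])  # synthetic global fixes
    s, t = fit_scale_translation(slam, g2s)
    print(s, t)  # recovers scale 2.0 and offset [1, -1]
```

In the paper's pipeline this alignment-and-check loop is iterated, with the fused trajectory re-entering the selection stage so that a tighter bound can reject more G2S outliers on each pass.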
URL
https://arxiv.org/abs/2404.09169