Abstract
Visual Odometry (VO) is vital for the navigation of autonomous systems, providing accurate position and orientation estimates at reasonable cost. While traditional VO methods excel under favorable conditions, they struggle with challenges such as variable lighting and motion blur. Deep learning-based VO, though more adaptable, can suffer from poor generalization in new environments. To address these drawbacks, this paper presents a novel hybrid VO framework that leverages pose-only supervision, balancing robustness against the need for extensive labeling. We propose two cost-effective and innovative designs: a self-supervised homographic pre-training that enhances optical flow learning from pose-only labels, and a random patch-based salient point detection strategy for more accurate optical flow patch extraction. These designs eliminate the need for dense optical flow labels during training and significantly improve the system's generalization in diverse and challenging environments. Our pose-only supervised method achieves competitive performance on standard datasets, and greater robustness and generalization in extreme and unseen scenarios, even compared to state-of-the-art methods supervised with dense optical flow.
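The self-supervised homographic pre-training mentioned above rests on a simple fact: when an image pair is related by a known homography, the induced optical flow is available in closed form, so flow supervision can be generated without any dense labels. The following is a minimal sketch of that idea, not the paper's implementation; the function name and the pure-NumPy coordinate math are illustrative assumptions.

```python
import numpy as np

def homography_flow(H, height, width):
    """Dense optical flow induced by a 3x3 homography H.

    Each pixel (x, y) maps to H @ (x, y, 1)^T in homogeneous
    coordinates; the flow is the displacement after dehomogenizing.
    This is the 'free' supervision signal one can obtain by warping
    an image with a randomly sampled homography.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    ones = np.ones_like(xs)
    pts = np.stack([xs, ys, ones], axis=0).reshape(3, -1).astype(np.float64)
    warped = H @ pts
    warped = warped[:2] / warped[2:3]          # dehomogenize
    flow = (warped - pts[:2]).reshape(2, height, width)
    return flow                                 # flow[0] = dx, flow[1] = dy

# Example: a pure-translation homography shifts every pixel by (3, -2),
# so the induced flow field is constant.
H = np.array([[1.0, 0.0,  3.0],
              [0.0, 1.0, -2.0],
              [0.0, 0.0,  1.0]])
flow = homography_flow(H, 4, 5)
```

In a pre-training pipeline, one would sample random homographies, warp real images with them, and train the flow network on the resulting (image, warped image, analytic flow) triplets before fine-tuning with pose-only labels.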
URL
https://arxiv.org/abs/2404.04677