Abstract
We introduce a new system for Multi-Session SLAM, which tracks camera motion across multiple disjoint videos under a single global reference. Our approach couples the prediction of optical flow with solver layers to estimate camera pose. The backbone is trained end-to-end using a novel differentiable solver for wide-baseline two-view pose. The full system can connect disjoint sequences, perform visual odometry, and run global optimization. Compared to existing approaches, our design is accurate and robust to catastrophic failures. Code is available.
URL
https://arxiv.org/abs/2404.15263