Abstract
We introduce MUTE-SLAM, a real-time neural RGB-D SLAM system employing multiple tri-plane hash-encodings for efficient scene representation. MUTE-SLAM effectively tracks camera positions and incrementally builds a scalable multi-map representation for both small and large indoor environments. It dynamically allocates sub-maps for newly observed local regions, enabling constraint-free mapping without prior scene information. Unlike traditional grid-based methods, we use three orthogonal axis-aligned planes for hash-encoding scene properties, significantly reducing hash collisions and the number of trainable parameters. This hybrid approach not only speeds up convergence but also enhances the fidelity of surface reconstruction. Furthermore, our optimization strategy concurrently optimizes all sub-maps intersecting with the current camera frustum, ensuring global consistency. Extensive testing on both real-world and synthetic datasets has shown that MUTE-SLAM delivers state-of-the-art surface reconstruction quality and competitive tracking performance across diverse indoor settings. The code will be made public upon acceptance of the paper.
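To make the representation described above concrete, the following is a minimal sketch of a tri-plane hash-encoding with on-demand sub-map allocation. It is not the authors' implementation. Every name in it (HashPlane, TriPlaneSubMap, MultiMap, observe) is hypothetical, and several design choices are assumptions that the abstract does not specify: a single resolution level per plane, an XOR-with-prime hash, summation of the three plane features, fixed-size cubic sub-maps centred on the first uncovered point, and summation across overlapping sub-maps.

```python
import random

PRIME = 2654435761  # large prime for spatial hashing (an assumed choice)


class HashPlane:
    """One 2D feature plane stored in a fixed-size hash table."""

    def __init__(self, resolution, table_size, feat_dim, seed=0):
        rng = random.Random(seed)
        self.resolution = resolution  # grid vertices per axis, must be >= 2
        self.table_size = table_size
        self.feat_dim = feat_dim
        # These would be trainable in a real system; small random values here.
        self.table = [[rng.uniform(-1e-4, 1e-4) for _ in range(feat_dim)]
                      for _ in range(table_size)]

    def _slot(self, i, j):
        # Hash a 2D vertex index into a table slot.
        return (i ^ (j * PRIME)) % self.table_size

    def lookup(self, u, v):
        """Bilinearly interpolate the features at (u, v) in [0, 1]^2."""
        x, y = u * (self.resolution - 1), v * (self.resolution - 1)
        i0 = min(int(x), self.resolution - 2)
        j0 = min(int(y), self.resolution - 2)
        tx, ty = x - i0, y - j0
        out = [0.0] * self.feat_dim
        for di, wx in ((0, 1.0 - tx), (1, tx)):
            for dj, wy in ((0, 1.0 - ty), (1, ty)):
                feat = self.table[self._slot(i0 + di, j0 + dj)]
                for k in range(self.feat_dim):
                    out[k] += wx * wy * feat[k]
        return out


class TriPlaneSubMap:
    """Axis-aligned cubic sub-map encoded by three orthogonal hash planes."""

    def __init__(self, lo, size, resolution=16, table_size=256,
                 feat_dim=4, seed=0):
        self.lo, self.size = lo, size
        self.planes = [HashPlane(resolution, table_size, feat_dim, seed + a)
                       for a in range(3)]

    def contains(self, p):
        return all(self.lo[a] <= p[a] <= self.lo[a] + self.size
                   for a in range(3))

    def encode(self, p):
        # Normalise to [0, 1]^3, project onto the xy, xz and yz planes,
        # then sum the three plane features (summation is an assumption).
        x, y, z = ((p[a] - self.lo[a]) / self.size for a in range(3))
        coords = ((x, y), (x, z), (y, z))
        feats = [pl.lookup(u, v) for pl, (u, v) in zip(self.planes, coords)]
        return [sum(vals) for vals in zip(*feats)]


class MultiMap:
    """Collection of sub-maps that grows as new regions are observed."""

    def __init__(self, submap_size=4.0):
        self.submap_size = submap_size
        self.submaps = []

    def observe(self, p):
        # Allocate a new sub-map centred on p if no existing one covers it.
        if not any(m.contains(p) for m in self.submaps):
            lo = tuple(c - self.submap_size / 2 for c in p)
            seed = 3 * len(self.submaps)
            self.submaps.append(TriPlaneSubMap(lo, self.submap_size, seed=seed))

    def encode(self, p):
        # Sum features over every sub-map containing p (an assumption).
        feats = [m.encode(p) for m in self.submaps if m.contains(p)]
        if not feats:
            raise ValueError("point is not covered by any sub-map")
        return [sum(vals) for vals in zip(*feats)]


if __name__ == "__main__":
    world = MultiMap(submap_size=4.0)
    for point in [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (10.0, 0.0, 0.0)]:
        world.observe(point)
    print(len(world.submaps), world.encode((0.5, 0.5, 0.5)))
```

In this sketch, each sub-map at resolution N addresses at most 3·N² plane vertices, whereas a dense voxel grid of the same resolution addresses N³. This illustrates why a plane-based layout can lower both the parameter count and the pressure on the hash tables.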
URL
https://arxiv.org/abs/2403.17765