Abstract
Reconstructing 3D scenes from sparse viewpoints is a long-standing challenge with wide applications. Recent feed-forward 3D Gaussian sparse-view reconstruction methods provide an efficient solution for real-time novel view synthesis by leveraging geometric priors learned from large-scale multi-view datasets and computing 3D Gaussian centers via back-projection. Despite offering strong geometric cues, both feed-forward multi-view depth estimation and joint flow-depth estimation face key limitations: the former suffers from mislocalization and artifacts in low-texture or repetitive regions, while the latter is prone to local noise and global inconsistency caused by unreliable matches when ground-truth flow supervision is unavailable. To overcome these limitations, we propose JointSplat, a unified framework that exploits the complementarity between optical flow and depth via a novel probabilistic optimization mechanism. Specifically, this pixel-level mechanism scales the information fusion between depth and flow according to the matching probability of the optical flow during training. Building upon this mechanism, we further propose a novel multi-view depth-consistency loss that leverages the reliability of supervision while suppressing misleading gradients in uncertain areas. Evaluated on RealEstate10K and ACID, JointSplat consistently outperforms state-of-the-art (SOTA) methods, demonstrating the effectiveness and robustness of our probabilistic joint flow-depth optimization approach for high-fidelity sparse-view 3D reconstruction.
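The abstract describes two ideas: pixel-level fusion of depth and flow-derived cues weighted by the optical-flow matching probability, and a depth-consistency loss whose gradients are suppressed in uncertain regions. The sketch below illustrates that general pattern; the function names, the linear blending scheme, and the confidence-weighted L1 form are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def fuse_depth_flow(depth_pred, flow_depth, match_prob):
    # Illustrative pixel-level fusion (assumed, not the paper's exact rule):
    # the flow-derived depth contributes in proportion to the optical-flow
    # matching probability, so unreliable matches fall back to the
    # network's own depth prediction.
    return match_prob * flow_depth + (1.0 - match_prob) * depth_pred

def depth_consistency_loss(depth_a, depth_b_warped, match_prob):
    # Illustrative confidence-weighted L1 consistency loss: residuals in
    # low-confidence regions are down-weighted, suppressing misleading
    # gradients where matches are uncertain.
    residual = np.abs(depth_a - depth_b_warped)
    return float((match_prob * residual).sum() / (match_prob.sum() + 1e-8))
```

For example, with a uniform matching probability of 0.5 the fused depth is simply the average of the two estimates, and as the probability approaches zero the flow-derived depth stops influencing both the fusion and the loss.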
URL
https://arxiv.org/abs/2506.03872