Abstract
Recent methods for generating dynamic 3D objects from a video have shown impressive results. Existing methods directly optimize Gaussians using all the information in the frames. However, when dynamic regions are interwoven with static regions within frames, particularly when static regions account for a large proportion of each frame, these methods often overlook information in the dynamic regions and tend to overfit to the static regions, which produces results with blurry textures. We argue that decoupling dynamic and static features to enhance dynamic representations can alleviate this issue. We therefore propose a dynamic-static feature decoupling module (DSFD). Along the temporal axis, it treats the portions of the current frame's features that differ significantly from the reference frame's features as dynamic features, and the remaining portions as static features. We then obtain decoupled features from the dynamic features and the current frame features. Moreover, to further enhance the dynamic representation of decoupled features from different viewpoints and to ensure accurate motion prediction, we design a temporal-spatial similarity fusion module (TSSF). Along the spatial axis, it adaptively selects similar information from dynamic regions. Building on these two modules, we construct a novel approach, DS4D. Experimental results show that our method achieves state-of-the-art (SOTA) results in video-to-4D generation. In addition, experiments on a real-world scene dataset demonstrate its effectiveness on 4D scenes. Our code will be publicly available.
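The decoupling idea behind DSFD can be illustrated with a minimal sketch. This is not the authors' module: the abstract does not specify how "significant differences" are measured or how the decoupled features are formed, so the hard threshold, the mean absolute difference, the additive combination, and every name below (`decouple_dynamic_static`, `threshold`) are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def decouple_dynamic_static(current, reference, threshold=0.5):
    """Split current-frame features into dynamic and static parts.

    current, reference: arrays of shape (N, C), i.e. N feature tokens
    with C channels each. `threshold` is a hypothetical cutoff on the
    per-token mean absolute difference; the paper's criterion may differ.
    """
    current = np.asarray(current, dtype=float)
    reference = np.asarray(reference, dtype=float)

    # Per-token difference between the current and reference frame features.
    diff = np.abs(current - reference).mean(axis=1)  # shape (N,)

    # Tokens that changed a lot are treated as dynamic, the rest as static.
    dynamic_mask = diff > threshold
    dynamic = current * dynamic_mask[:, None]
    static = current * (~dynamic_mask)[:, None]

    # Assumed combination: emphasize dynamic tokens on top of the
    # current-frame features (illustrative only).
    decoupled = current + dynamic
    return dynamic, static, decoupled, dynamic_mask

# Toy input: four tokens, only token 2 changes between frames.
reference = np.zeros((4, 3))
current = reference.copy()
current[2] = 1.0
dynamic, static, decoupled, mask = decouple_dynamic_static(current, reference)
```

In this toy input only the third token is flagged as dynamic, and the dynamic and static parts sum back to the current-frame features. A learned module would likely replace the hard threshold with a soft, trainable weighting.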
URL
https://arxiv.org/abs/2502.08377