Abstract
Rigged objects are commonly used in artist pipelines because they adapt flexibly to different scenes and postures. However, articulating rigs into realistic, affordance-aware postures (e.g., following the context and respecting the physics and the personality of the object) remains time-consuming and relies heavily on manual labor from experienced artists. In this paper, we tackle this novel problem and design A3Syn. Given a context, such as an environment mesh and a text prompt describing the desired posture, A3Syn synthesizes articulation parameters for arbitrary, open-domain rigged objects obtained from the Internet. The task is challenging because training data is lacking and because we make no topological assumptions about the open-domain rigs. We propose using a 2D inpainting diffusion model and several control techniques to synthesize in-context affordance information. We then develop an efficient bone correspondence alignment that combines differentiable rendering and semantic correspondence. A3Syn converges stably, completes in minutes, and synthesizes plausible affordances for different combinations of in-the-wild object rigs and scenes.
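To give a rough feel for what aligning bones to 2D correspondence targets can look like, here is a toy sketch in Python. It is not A3Syn's implementation: the planar bone chain, the function names (`forward_kinematics`, `alignment_loss`, `fit_angles`), the step size, and the use of finite-difference gradient descent in place of differentiable rendering are all assumptions made for illustration. The `targets` list stands in for 2D joint locations that a semantic correspondence step might provide.

```python
import math

def forward_kinematics(angles, lengths):
    """Return the 2D end position of each bone in a planar chain rooted at the origin."""
    x, y, theta = 0.0, 0.0, 0.0
    joints = []
    for angle, length in zip(angles, lengths):
        theta += angle  # each angle is relative to the parent bone
        x += length * math.cos(theta)
        y += length * math.sin(theta)
        joints.append((x, y))
    return joints

def alignment_loss(angles, lengths, targets):
    """Sum of squared distances between posed joints and their 2D targets."""
    joints = forward_kinematics(angles, lengths)
    return sum((jx - tx) ** 2 + (jy - ty) ** 2
               for (jx, jy), (tx, ty) in zip(joints, targets))

def fit_angles(lengths, targets, steps=5000, lr=0.04, eps=1e-6):
    """Gradient descent on joint angles using central finite differences.

    Hypothetical stand-in for backpropagating through a differentiable renderer.
    """
    angles = [0.0] * len(lengths)
    for _ in range(steps):
        grad = []
        for i in range(len(angles)):
            plus = list(angles)
            minus = list(angles)
            plus[i] += eps
            minus[i] -= eps
            grad.append((alignment_loss(plus, lengths, targets)
                         - alignment_loss(minus, lengths, targets)) / (2 * eps))
        angles = [a - lr * g for a, g in zip(angles, grad)]
    return angles

# Synthetic example: targets come from a known pose, so the fit can be checked.
lengths = [1.0, 0.8, 0.5]
true_angles = [0.4, -0.3, 0.6]
targets = forward_kinematics(true_angles, lengths)
fitted = fit_angles(lengths, targets)
final_loss = alignment_loss(fitted, lengths, targets)
```

In this sketch the targets are generated from a known pose, so the recovered angles can be compared against it. A real pipeline would have to obtain the targets from synthesized images and handle 3D rigs, cameras, and occlusion, none of which is modeled here.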
URL
https://arxiv.org/abs/2501.12393