Paper Reading AI Learner

Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials

2024-04-25 17:59:58
Ye Fang, Zeyi Sun, Tong Wu, Jiaqi Wang, Ziwei Liu, Gordon Wetzstein, Dahua Lin

Abstract

Physically realistic materials are pivotal in augmenting the realism of 3D assets across various applications and lighting conditions. However, existing 3D assets and generative models often lack authentic material properties. Manual assignment of materials in graphics software is a tedious and time-consuming task. In this paper, we exploit advancements in Multimodal Large Language Models (MLLMs), particularly GPT-4V, to present a novel approach, Make-it-Real: 1) We demonstrate that GPT-4V can effectively recognize and describe materials, allowing the construction of a detailed material library. 2) Utilizing a combination of visual cues and hierarchical text prompts, GPT-4V precisely identifies and aligns materials with the corresponding components of 3D objects. 3) The correctly matched materials are then meticulously applied as references for generating new SVBRDF materials from the original diffuse maps, significantly enhancing the objects' visual authenticity. Make-it-Real offers streamlined integration into the 3D content creation workflow, showcasing its utility as an essential tool for developers of 3D assets.
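To make the three-stage pipeline the abstract describes more concrete, here is a minimal Python sketch. Every name in it (Material, MaterialLibrary, ask_mllm, the dummy library entries, and the placeholder file name) is a hypothetical stand-in invented for illustration; this is not the authors' released code, and the real system queries GPT-4V with rendered views rather than returning canned answers.

```python
# Hypothetical sketch of the Make-it-Real pipeline described in the abstract:
# 1) build a material library, 2) match library materials to object parts via
# an MLLM, 3) use the match as a reference to produce SVBRDF maps consistent
# with the original diffuse map. All names and values are illustrative.

from dataclasses import dataclass, field


@dataclass
class Material:
    name: str                                     # e.g. "brushed steel"
    svbrdf: dict = field(default_factory=dict)    # scalar stand-ins for
                                                  # roughness/metallic maps


@dataclass
class MaterialLibrary:
    materials: list

    def lookup(self, description: str) -> Material:
        # Toy matcher: pick the entry whose name shares the most words
        # with the MLLM's description of the part.
        words = set(description.split())
        return max(self.materials,
                   key=lambda m: len(set(m.name.split()) & words))


def ask_mllm(prompt: str, image=None) -> str:
    # Stand-in for a GPT-4V call with visual cues and hierarchical text
    # prompts; returns a fixed answer so the sketch runs offline.
    return "brushed steel"


def assign_materials(parts, library: MaterialLibrary) -> dict:
    # Stage 2: match each segmented component to a library material.
    return {part: library.lookup(ask_mllm(f"What material is the '{part}'?"))
            for part in parts}


def generate_svbrdf(material: Material, diffuse_map) -> dict:
    # Stage 3 (greatly simplified): treat the matched material as a
    # reference and keep the original diffuse map as the albedo.
    return {"albedo": diffuse_map, **material.svbrdf}


if __name__ == "__main__":
    lib = MaterialLibrary([
        Material("brushed steel", {"roughness": 0.4, "metallic": 1.0}),
        Material("oak wood", {"roughness": 0.7, "metallic": 0.0}),
    ])
    assignments = assign_materials(["blade", "handle"], lib)
    print({part: m.name for part, m in assignments.items()})
    print(generate_svbrdf(assignments["blade"], "blade_diffuse.png"))
```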

URL

https://arxiv.org/abs/2404.16829

PDF

https://arxiv.org/pdf/2404.16829.pdf

