Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility

2025-05-27 16:06:04
Yidi Li, Jun Xiao, Zhengda Lu, Yiqun Wang, Haiyong Jiang

Abstract

This work presents Dream3DVG, a novel text-to-vector-graphics generation approach that supports viewing from arbitrary viewpoints, progressive detail optimization, and view-dependent occlusion awareness. Our approach is a dual-branch optimization framework consisting of an auxiliary 3D Gaussian Splatting (3DGS) optimization branch and a 3D vector graphics optimization branch. The 3DGS branch bridges the domain gap between text prompts and vector graphics by providing more consistent guidance. Moreover, 3DGS allows progressive detail control by scheduling classifier-free guidance, so that the vector graphics are guided by coarse shapes in the initial stages and by finer details in later stages. We also improve the handling of view-dependent occlusions by devising a visibility-aware rendering module. Extensive results on 3D sketches and 3D iconographies demonstrate the superiority of the method across different levels of abstraction, cross-view consistency, and occlusion-aware stroke culling.
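
The abstract mentions scheduling classifier-free guidance to move from coarse shapes to finer details. The following is a minimal sketch of what such a schedule could look like, not the authors' implementation: the function names, the linear interpolation, and the start and end scales are all assumptions made for illustration, and the abstract does not say in which direction the scale changes or what values are used.

```python
from typing import List


def cfg_scale(step: int, total_steps: int, start: float, end: float) -> float:
    """Hypothetical schedule: linearly interpolate the guidance scale.

    `start` and `end` are placeholders; the source does not specify them.
    """
    if total_steps <= 1:
        return end
    # Fraction of optimization completed, clamped to [0, 1].
    t = min(max(step / (total_steps - 1), 0.0), 1.0)
    return start + t * (end - start)


def guided_noise(eps_uncond: List[float], eps_cond: List[float],
                 scale: float) -> List[float]:
    """Standard classifier-free guidance combination of two noise predictions:
    eps = eps_uncond + scale * (eps_cond - eps_uncond).
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


if __name__ == "__main__":
    # Arbitrary example values, chosen only to show the schedule changing.
    for step in (0, 5, 10):
        s = cfg_scale(step, total_steps=11, start=100.0, end=7.5)
        print(step, s, guided_noise([0.0, 0.5], [1.0, 0.5], s))
```

In this sketch the schedule is a pure function of the optimization step, so it could be swapped for a cosine or piecewise schedule without touching the guidance combination itself.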

URL

https://arxiv.org/abs/2505.21377

PDF

https://arxiv.org/pdf/2505.21377.pdf

