CAD-Prompted Generative Models: A Pathway to Feasible and Novel Engineering Designs

2024-07-11 17:07:32
Leah Chong, Jude Rayan, Steven Dow, Ioanna Lykourentzou, Faez Ahmed

Abstract

Text-to-image generative models have increasingly been used to assist designers during concept generation in various creative domains, such as graphic design, user interface design, and fashion design. However, their applications in engineering design remain limited because the models struggle to generate images of feasible design concepts. To address this issue, this paper introduces a method that improves design feasibility by prompting the generation with feasible CAD images. In this work, the usefulness of this method is investigated through a case study with a bike design task using an off-the-shelf text-to-image model, Stable Diffusion 2.1. A diverse set of bike designs is produced in seven different generation settings with varying CAD image prompting weights, and these designs are evaluated on their perceived feasibility and novelty. Results demonstrate that CAD image prompting successfully helps text-to-image models like Stable Diffusion 2.1 create visibly more feasible design images. While a general tradeoff is observed between feasibility and novelty, when the prompting weight is kept low, around 0.35, design feasibility is significantly improved while novelty remains on par with that of designs generated by text prompts alone. The insights from this case study offer some guidelines for selecting the appropriate CAD image prompting weight for different stages of the engineering design process. When utilized effectively, our CAD image prompting method opens doors to a wider range of applications of text-to-image models in engineering design.
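
The abstract does not say how the CAD image prompting weight enters the model. As a rough illustration only, the sketch below assumes the weight acts as a linear blend between a text conditioning vector and a CAD image conditioning vector. The function name `blend_conditioning` and the toy vectors are hypothetical and are not taken from the paper or from any Stable Diffusion API.

```python
from typing import List


def blend_conditioning(text_emb: List[float],
                       cad_emb: List[float],
                       weight: float) -> List[float]:
    """Linearly blend two conditioning vectors (illustrative assumption).

    weight = 0.0 keeps the text conditioning only;
    weight = 1.0 keeps the CAD image conditioning only.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(text_emb) != len(cad_emb):
        raise ValueError("embeddings must have the same length")
    return [(1.0 - weight) * t + weight * c
            for t, c in zip(text_emb, cad_emb)]


# Toy vectors standing in for real embeddings (hypothetical values).
text_emb = [1.0, 0.0, 0.0, 2.0]
cad_emb = [0.0, 1.0, 0.0, 2.0]

text_only = blend_conditioning(text_emb, cad_emb, 0.0)    # text prompt alone
low_weight = blend_conditioning(text_emb, cad_emb, 0.35)  # low CAD weight
```

Under this assumed blend, a weight of 0.35 leaves the conditioning closer to the text prompt than to the CAD image, which is one way to picture why a low weight might raise feasibility without costing much novelty.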

URL

https://arxiv.org/abs/2407.08675

PDF

https://arxiv.org/pdf/2407.08675.pdf
