Paper Reading AI Learner

Diffusion Large Language Models for Black-Box Optimization

2026-01-20 19:59:29
Ye Yuan, Can (Sam) Chen, Zipeng Sun, Dinghuai Zhang, Christopher Pal, Xue Liu

Abstract

Offline black-box optimization (BBO) aims to find optimal designs based solely on an offline dataset of designs and their labels. Such scenarios arise frequently in domains like DNA sequence design and robotics, where only a few labeled data points are available. Traditional methods typically rely on task-specific proxy or generative models, overlooking the in-context learning capabilities of pre-trained large language models (LLMs). Recent efforts have adapted autoregressive LLMs to BBO by framing task descriptions and offline datasets as natural-language prompts, enabling direct design generation. However, these designs often contain bidirectional dependencies, which left-to-right models struggle to capture. In this paper, we explore diffusion LLMs for BBO, leveraging their bidirectional modeling and iterative refinement capabilities. This motivates our in-context denoising module: we condition the diffusion LLM on the task description and the offline dataset, both formatted in natural language, and prompt it to denoise masked designs into improved candidates. To guide generation toward high-performing designs, we introduce masked diffusion tree search, which casts the denoising process as a step-wise Monte Carlo Tree Search that dynamically balances exploration and exploitation. Each node represents a partially masked design, each denoising step is an action, and candidates are evaluated via expected improvement under a Gaussian Process trained on the offline dataset. Our method, dLLM, achieves state-of-the-art results in few-shot settings on Design-Bench.
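The search loop described in the abstract (partially masked designs as nodes, denoising steps as actions, candidates scored by expected improvement under a Gaussian Process fit to the offline data) can be illustrated with a minimal sketch. Here `propose_unmask` is a hypothetical stand-in for the diffusion LLM denoiser, and a greedy one-step lookahead replaces the paper's full Monte Carlo Tree Search; all names and the toy dataset are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

MASK = -1  # placeholder token for masked positions


def expected_improvement(gp, x, best_y):
    """Expected improvement of a candidate design under the GP surrogate."""
    mu, sigma = gp.predict(np.atleast_2d(x), return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best_y) / sigma
    return float((mu - best_y) * norm.cdf(z) + sigma * norm.pdf(z))


def propose_unmask(design, pos, vocab, rng):
    """Hypothetical stand-in for the diffusion LLM: proposes a few
    candidate tokens for one masked position."""
    return rng.choice(vocab, size=3, replace=False)


def tree_search_denoise(gp, length, vocab, best_y, rng):
    """Greedy one-step-lookahead version of the step-wise search:
    each denoising step fills one masked position, keeping the
    candidate token with the highest expected improvement."""
    design = np.full(length, MASK)
    for pos in range(length):
        best_ei, best_tok = -np.inf, vocab[0]
        for tok in propose_unmask(design, pos, vocab, rng):
            cand = design.copy()
            cand[pos] = tok
            # Score with any remaining masks imputed by a neutral token.
            filled = np.where(cand == MASK, 0, cand)
            ei = expected_improvement(gp, filled, best_y)
            if ei > best_ei:
                best_ei, best_tok = ei, tok
        design[pos] = best_tok
    return design


rng = np.random.default_rng(0)
# Toy offline dataset: 2-token designs labeled by a hidden objective.
X = rng.integers(0, 5, size=(20, 2))
y = -((X[:, 0] - 3) ** 2 + (X[:, 1] - 1) ** 2).astype(float)
gp = GaussianProcessRegressor().fit(X, y)
candidate = tree_search_denoise(gp, length=2, vocab=np.arange(5),
                                best_y=y.max(), rng=rng)
```

The full method additionally backs up EI values through a search tree to balance exploration and exploitation; this sketch only shows the per-step propose-and-score pattern.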

URL

https://arxiv.org/abs/2601.14446

PDF

https://arxiv.org/pdf/2601.14446.pdf
