Paper Reading AI Learner

Behavior and Representation in Large Language Models for Combinatorial Optimization: From Feature Extraction to Algorithm Selection

2025-12-15 14:28:35
Francesca Da Ros, Luca Di Gaspero, Kevin Roitero

Abstract

Recent advances in Large Language Models (LLMs) have opened new perspectives for automation in optimization. While several studies have explored how LLMs can generate or solve optimization models, far less is understood about what these models actually learn regarding problem structure or algorithmic behavior. This study investigates how LLMs internally represent combinatorial optimization problems and whether such representations can support downstream decision tasks. We adopt a twofold methodology combining direct querying, which assesses LLM capacity to explicitly extract instance features, with probing analyses that examine whether such information is implicitly encoded within their hidden layers. The probing framework is further extended to a per-instance algorithm selection task, evaluating whether LLM-derived representations can predict the best-performing solver. Experiments span four benchmark problems and three instance representations. Results show that LLMs exhibit moderate ability to recover feature information from problem instances, either through direct querying or probing. Notably, the predictive power of LLM hidden-layer representations proves comparable to that achieved through traditional feature extraction, suggesting that LLMs capture meaningful structural information relevant to optimization performance.
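The probing methodology the abstract describes — keeping the LLM's hidden-layer representations frozen and training a lightweight classifier on top of them to predict the best-performing solver — can be sketched as follows. This is an illustrative toy, not the paper's pipeline: the hidden states are simulated with random vectors (in the real setting they would come from a forward pass over an instance's textual representation), the labels are synthetic, and the probe is a plain scikit-learn logistic regression.

```python
# Minimal sketch of a linear probe for per-instance algorithm selection:
# frozen instance representations in, predicted best solver out.
# All data here is synthetic and for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for LLM hidden states: one fixed vector per problem instance.
n_instances, hidden_dim, n_solvers = 400, 64, 3
X = rng.normal(size=(n_instances, hidden_dim))

# Synthetic "best solver" labels that depend linearly on the
# representation, so a linear probe can recover the signal.
W = rng.normal(size=(hidden_dim, n_solvers))
y = (X @ W).argmax(axis=1)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The probe: a linear classifier trained on frozen representations.
# If it beats chance, the representations encode solver-relevant structure.
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
accuracy = probe.score(X_te, y_te)
print(f"probe accuracy: {accuracy:.2f}")
```

The same template applies to the feature-recovery experiments: swap the classification target for a regression on instance features (e.g. `Ridge` instead of `LogisticRegression`) while keeping the representations fixed.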

URL

https://arxiv.org/abs/2512.13374

PDF

https://arxiv.org/pdf/2512.13374.pdf
