Paper Reading AI Learner

No more optimization rules: LLM-enabled policy-based multi-modal query optimizer

2024-03-20 13:44:30
Yifan Wang, Haodi Ma, Daisy Zhe Wang

Abstract

Large language models (LLMs) have marked a pivotal moment in machine learning and deep learning. Their capability for query planning has recently been investigated for both single-modal and multi-modal queries. However, no existing work examines the query optimization capability of LLMs. Since optimization is a critical step, arguably the most important one, that significantly impacts the execution performance of a query plan, such analysis should not be missed. From another angle, existing query optimizers are usually rule-based or rule-plus-cost-based, i.e., they depend on manually created rules to rewrite and transform query plans. Given that modern optimizers contain hundreds to thousands of rules, designing a multi-modal query optimizer in the same way would be extremely time-consuming, as it would require enumerating as many multi-modal optimization rules as possible, a problem that remains poorly addressed today. In this paper, we investigate the query optimization ability of LLMs and use an LLM to design LaPuda, a novel LLM- and policy-based multi-modal query optimizer. Instead of enumerating specific, detailed rules, LaPuda needs only a few abstract policies to guide the LLM during optimization, saving substantial time and human effort. Furthermore, to prevent the LLM from making mistakes or performing negative optimization, we borrow the idea of gradient descent and propose a guided cost descent (GCD) algorithm, which keeps the optimization moving in the correct direction. In our evaluation, our methods outperform the baselines in most cases. For example, the optimized plans generated by our methods execute 1~3x faster than those produced by the baselines.
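
The abstract only sketches how guided cost descent (GCD) works, so below is a minimal, self-contained Python sketch of the general idea rather than the paper's implementation: a proposer (standing in for the policy-prompted LLM) suggests candidate plan rewrites, a toy cost model scores them, and a rewrite is kept only if it strictly lowers the estimated cost, so the search can only move in the cost-decreasing direction. All names here (`Plan`, `estimate_cost`, `propose_rewrites`, the policy strings, and the cost weights) are assumptions introduced for illustration.

```python
# Hedged sketch (not the paper's implementation) of a "guided cost descent"
# style loop: candidate plan rewrites are proposed, and a rewrite is accepted
# only if it strictly lowers the estimated cost, mirroring how gradient
# descent only moves in the direction that reduces the objective.
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Plan:
    # Toy plan representation: an ordered tuple of operator names.
    operators: tuple


# Abstract policies of the kind the paper describes; in the LLM setting these
# would be part of the prompt rather than executable rules.
POLICIES = [
    "Push cheap single-modal filters below expensive multi-modal operators.",
    "Prefer highly selective operators earlier in the pipeline.",
]


def estimate_cost(plan: Plan) -> float:
    # Toy cost model: expensive operators placed early cost more because they
    # see more data; a real optimizer would use cardinality/selectivity stats.
    weights = {"ml_filter": 10.0, "sql_filter": 1.0, "join": 5.0}
    return sum(weights.get(op, 1.0) * (len(plan.operators) - i)
               for i, op in enumerate(plan.operators))


def propose_rewrites(plan: Plan, policies: List[str]) -> List[Plan]:
    # Stand-in for the LLM call: here we just enumerate adjacent operator
    # swaps; `policies` is unused in this mock but would guide a real LLM.
    candidates = []
    ops = list(plan.operators)
    for i in range(len(ops) - 1):
        swapped = ops[:i] + [ops[i + 1], ops[i]] + ops[i + 2:]
        candidates.append(Plan(tuple(swapped)))
    return candidates


def guided_cost_descent(plan: Plan, max_steps: int = 10) -> Plan:
    # Accept a candidate only if it strictly reduces the estimated cost; stop
    # when no proposal improves the plan (a local minimum of the cost model).
    for _ in range(max_steps):
        current_cost = estimate_cost(plan)
        best = min(propose_rewrites(plan, POLICIES), key=estimate_cost,
                   default=plan)
        if estimate_cost(best) >= current_cost:
            break  # no cost-decreasing rewrite found; keep the current plan
        plan = best
    return plan


if __name__ == "__main__":
    initial = Plan(("ml_filter", "join", "sql_filter"))
    optimized = guided_cost_descent(initial)
    print("initial  :", initial.operators, estimate_cost(initial))
    print("optimized:", optimized.operators, estimate_cost(optimized))
```

Running the sketch pushes the cheap `sql_filter` ahead of the expensive `ml_filter`, illustrating (under the toy cost model) how cost-guided acceptance prevents negative optimization: any rewrite the proposer suggests that would raise the estimated cost is simply rejected.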


URL

https://arxiv.org/abs/2403.13597

PDF

https://arxiv.org/pdf/2403.13597.pdf
