Paper Reading AI Learner

CallNavi: A Study and Challenge on Function Calling Routing and Invocation in Large Language Models

2025-01-09 14:12:43
Yewei Song, Cedric Lothritz, Xunzhu Tang, Saad Ezzini, Jacques Klein, Tegawendé F. Bissyandé, Andrey Boytsov, Ulrick Ble, Anne Goujon

Abstract

Interacting with a software system via a chatbot can be challenging, especially when the chatbot needs to generate API calls, in the right order and with the right parameters, to communicate with the system. The difficulty is greatest in complex, multi-step tasks that require accurate API selection and execution. We contribute to this domain in three ways: first, by introducing a novel dataset designed to assess models on API function selection, parameter generation, and nested API calls; second, by benchmarking state-of-the-art language models across varying levels of complexity to evaluate their performance in API function generation and parameter accuracy; and third, by proposing an enhanced API routing method that combines general-purpose large language models for API selection with fine-tuned models for parameter generation, supplemented by prompt engineering. These approaches lead to substantial improvements in handling complex API tasks, offering practical advancements for real-world API-driven chatbot systems.
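
The two-stage routing idea in the abstract (one model selects the APIs, another fills in their parameters) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: `ApiSpec`, `ApiCall`, `route`, `keyword_selector`, and `echo_filler` are hypothetical names, and the two stand-in functions replace what would be a general-purpose LLM and a fine-tuned model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ApiSpec:
    """Hypothetical catalogue entry describing one callable API."""
    name: str
    description: str
    parameters: List[str]


@dataclass
class ApiCall:
    """Hypothetical structured call produced by the router."""
    name: str
    arguments: Dict[str, str]


# Stage 1 picks API names; stage 2 produces arguments for one API.
Selector = Callable[[str, List[ApiSpec]], List[str]]
ParamGenerator = Callable[[str, ApiSpec], Dict[str, str]]


def route(query: str, catalogue: List[ApiSpec],
          select: Selector, fill: ParamGenerator) -> List[ApiCall]:
    """Run selection first, then parameter generation per selected API."""
    by_name = {spec.name: spec for spec in catalogue}
    calls = []
    for name in select(query, catalogue):
        spec = by_name.get(name)
        if spec is None:
            continue  # ignore names that are not in the catalogue
        raw_args = fill(query, spec)
        # Keep only parameters that the API actually declares.
        args = {k: v for k, v in raw_args.items() if k in spec.parameters}
        calls.append(ApiCall(name, args))
    return calls


def keyword_selector(query: str, catalogue: List[ApiSpec]) -> List[str]:
    """Toy stand-in for the selection model: word overlap with descriptions."""
    words = set(query.lower().split())
    return [s.name for s in catalogue
            if words & set(s.description.lower().split())]


def echo_filler(query: str, spec: ApiSpec) -> Dict[str, str]:
    """Toy stand-in for the parameter model: copies the query into each slot."""
    return {p: query for p in spec.parameters}
```

Keeping the two stages behind separate callables means either one can be swapped for a real model without touching the routing logic; order and nesting of calls would need additional handling that this sketch omits.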

URL

https://arxiv.org/abs/2501.05255

PDF

https://arxiv.org/pdf/2501.05255.pdf

