GenerationPrograms: Fine-grained Attribution with Executable Programs

2025-06-17 14:37:09
David Wan, Eran Hirsch, Elias Stengel-Eskin, Ido Dagan, Mohit Bansal

Abstract

Recent large language models (LLMs) achieve impressive performance in source-conditioned text generation but often fail to correctly provide fine-grained attributions for their outputs, undermining verifiability and trust. Moreover, existing attribution methods do not explain how and why models leverage the provided source documents to generate their final responses, limiting interpretability. To overcome these challenges, we introduce a modular generation framework, GenerationPrograms, inspired by recent advancements in executable "code agent" architectures. Unlike conventional generation methods that simultaneously generate outputs and attributions or rely on post-hoc attribution, GenerationPrograms decomposes the process into two distinct stages: first, creating an executable program plan composed of modular text operations (such as paraphrasing, compression, and fusion) explicitly tailored to the query, and second, executing these operations following the program's specified instructions to produce the final response. Empirical evaluations demonstrate that GenerationPrograms significantly improves attribution quality at both the document level and sentence level across two long-form question-answering tasks and a multi-document summarization task. We further demonstrate that GenerationPrograms can effectively function as a post-hoc attribution method, outperforming traditional techniques in recovering accurate attributions. In addition, the interpretable programs generated by GenerationPrograms enable localized refinement through modular-level improvements that further enhance overall attribution quality.
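
Illustrative sketch (not from the paper)

The two-stage idea in the abstract, plan a program of modular text operations and then execute it, can be sketched in a few lines of Python. This is a minimal toy under stated assumptions, not the authors' implementation. The abstract does not say how the modules are implemented, so the paraphrase, compression, and fusion functions here are trivial string manipulations standing in for them. The nested-tuple program format, the Node class, the OPS table, and the execute function are all hypothetical names introduced for illustration. The program is written by hand, whereas the framework generates it for the query. The sketch only shows why an executable program gives attribution for free: every output carries the ids of the source sentences it was built from.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A piece of text plus the ids of the source sentences it derives from."""
    text: str
    sources: frozenset


# Toy stand-ins for the modules named in the abstract (assumption: the real
# modules are far more capable; these only illustrate the data flow).
def paraphrase(node: Node) -> Node:
    # Reword the text; attribution is unchanged.
    return Node(node.text.replace("LLMs", "language models"), node.sources)


def compress(node: Node) -> Node:
    # Drop parenthetical asides; attribution is unchanged.
    return Node(re.sub(r"\s*\([^)]*\)", "", node.text), node.sources)


def fuse(*nodes: Node) -> Node:
    # Merge several sentences into one; attribution is the union of inputs.
    parts = [n.text.rstrip(".") for n in nodes]
    merged = frozenset().union(*(n.sources for n in nodes))
    return Node(", and ".join(parts) + ".", merged)


OPS = {"paraphrase": paraphrase, "compression": compress, "fusion": fuse}


def execute(program, sources: dict) -> Node:
    """Stage two: run a program, given as nested tuples (op, arg, ...)."""
    if isinstance(program, str):
        # Leaf: a source-sentence id such as "S1".
        return Node(sources[program], frozenset({program}))
    op, *args = program
    return OPS[op](*(execute(arg, sources) for arg in args))


# Hypothetical source sentences and a hand-written stage-one plan.
sources = {
    "S1": "LLMs generate fluent text (often at length).",
    "S2": "This sentence is not used by the plan.",
    "S3": "LLMs rarely cite sources.",
}
program = ("fusion", ("compression", "S1"), ("paraphrase", "S3"))

result = execute(program, sources)
print(result.text)
print(sorted(result.sources))
```

Because the attribution set is computed from the program structure rather than predicted alongside the text, an unused sentence such as S2 can never appear in it. Replacing or improving one module changes only the nodes that call it, which is the kind of localized refinement the abstract mentions.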

URL

https://arxiv.org/abs/2506.14580

PDF

https://arxiv.org/pdf/2506.14580.pdf

