Paper Reading AI Learner

AIR: Post-training Data Selection for Reasoning via Attention Head Influence

2025-12-15 12:38:24
Jinrui Liu, Jeff Wu, Xuanguang Pan, Gavin Cheung, Shuai Ma, Chongyang Tao

Abstract

LLMs achieve remarkable multi-step reasoning capabilities, yet effectively transferring these skills via post-training distillation remains challenging. Existing data selection methods, ranging from manual curation to heuristics based on length, entropy, or overall loss, fail to capture the causal importance of individual reasoning steps, limiting distillation efficiency. To address this, we propose Attention Influence for Reasoning (AIR), a principled, unsupervised, and training-free framework that leverages mechanistic insights into retrieval heads to select high-value post-training data. AIR first identifies reasoning-critical attention heads of an off-the-shelf model, then constructs a weakened reference model with the influence of those heads disabled, and finally quantifies the resulting loss divergence as the Attention Influence Score. This score enables fine-grained assessment at both the step and sample levels, supporting step-level weighted fine-tuning and global sample selection. Experiments across multiple reasoning benchmarks show that AIR consistently improves reasoning accuracy, surpassing heuristic baselines and effectively isolating the most critical steps and samples. Our work establishes a mechanism-driven, data-efficient approach for reasoning distillation in LLMs.
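The scoring procedure described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes we already have per-token losses from the full model and from a reference model with the reasoning-critical heads ablated, and that reasoning steps are given as (start, end) token spans. The function name and the exact aggregation (mean divergence per step and per sample) are assumptions for illustration.

```python
import numpy as np

def attention_influence_score(loss_full, loss_ablated, step_spans):
    """Hypothetical sketch of AIR-style scoring: the Attention Influence
    Score of a span is taken here as the mean increase in per-token loss
    when reasoning-critical attention heads are disabled.

    loss_full    : per-token losses from the original model
    loss_ablated : per-token losses from the head-ablated reference model
    step_spans   : list of (start, end) token index pairs, one per step
    """
    loss_full = np.asarray(loss_full, dtype=float)
    loss_ablated = np.asarray(loss_ablated, dtype=float)
    # Per-token divergence: how much the ablated model degrades here.
    divergence = loss_ablated - loss_full
    # Step-level scores (for step-level weighted fine-tuning).
    step_scores = [float(divergence[s:e].mean()) for s, e in step_spans]
    # Sample-level score (for global sample selection).
    sample_score = float(divergence.mean())
    return step_scores, sample_score

# Toy example: only tokens 1-2 depend on the ablated heads.
steps, sample = attention_influence_score(
    loss_full=[1.0, 1.0, 1.0, 1.0],
    loss_ablated=[1.0, 3.0, 1.0, 1.0],
    step_spans=[(0, 2), (2, 4)],
)
```

In the toy example the first step's score is 1.0 and the second's is 0.0, so step-level weighting would up-weight the first step, while the sample score (0.5) would rank this sample against others for global selection.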

URL

https://arxiv.org/abs/2512.13279

PDF

https://arxiv.org/pdf/2512.13279.pdf

