Paper Reading AI Learner

Flatten Long-Range Loss Landscapes for Cross-Domain Few-Shot Learning

2024-03-01 14:44:41
Yixiong Zou, Yicong Liu, Yiman Hu, Yuhua Li, Ruixuan Li

Abstract

Cross-domain few-shot learning (CDFSL) aims to acquire knowledge from limited training data in the target domain by leveraging prior knowledge transferred from source domains with abundant training samples. CDFSL faces challenges in transferring knowledge across dissimilar domains and fine-tuning models with limited training data. To address these challenges, we first extend the analysis of loss landscapes from the parameter space to the representation space, which allows us to simultaneously interpret the transferring and fine-tuning difficulties of CDFSL models. We observe that sharp minima in the loss landscapes of the representation space result in representations that are hard to transfer and fine-tune. Moreover, existing flatness-based methods have limited generalization ability due to their short-range flatness. To enhance the transferability and facilitate fine-tuning, we introduce a simple yet effective approach to achieve long-range flattening of the minima in the loss landscape. This approach considers representations that are differently normalized as minima in the loss landscape and flattens the high-loss region in the middle by randomly sampling interpolated representations. We implement this method as a new normalization layer that replaces the original one in both CNNs and ViTs. This layer is simple and lightweight, introducing only a minimal number of additional parameters. Experimental results on 8 datasets demonstrate that our approach outperforms state-of-the-art methods in terms of average accuracy. Moreover, our method achieves performance improvements of up to 9% compared to the current best approaches on individual datasets. Our code will be released.
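
The interpolation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' layer: the abstract does not say which two normalizations are blended, how the interpolation weight is sampled, or what happens at inference. The sketch assumes a BatchNorm-like and a LayerNorm-like normalization (both without learnable parameters), a weight drawn uniformly from [0, 1) during training, and a fixed midpoint of 0.5 at evaluation. All function names are hypothetical.

```python
import random
import statistics

EPS = 1e-5  # small constant for numerical stability (assumed value)


def _standardize(values):
    """Scale a list of floats to zero mean and (approximately) unit variance."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)
    return [(v - mean) / (var + EPS) ** 0.5 for v in values]


def batch_style_norm(x):
    """Normalize each feature column across the batch (BatchNorm-like, no affine)."""
    cols = [_standardize(list(col)) for col in zip(*x)]
    return [list(row) for row in zip(*cols)]


def sample_style_norm(x):
    """Normalize each sample across its own features (LayerNorm-like, no affine)."""
    return [_standardize(row) for row in x]


def interpolated_norm(x, training=True, rng=random):
    """Blend two differently normalized representations.

    During training the blend weight is sampled at random, so the model
    sees points along the path between the two normalized representations.
    Assumption: at evaluation the weight is fixed to the midpoint 0.5.
    """
    lam = rng.random() if training else 0.5
    a = batch_style_norm(x)
    b = sample_style_norm(x)
    out = [
        [lam * ai + (1 - lam) * bi for ai, bi in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]
    return out, lam


if __name__ == "__main__":
    features = [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]  # toy batch: 2 samples, 3 features
    blended, weight = interpolated_norm(features, training=True)
    print(weight, blended)
```

In a real CNN or ViT the same blend would operate on tensors inside the network's normalization layers. The pure-Python version above only shows the sampling-and-interpolation step.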

URL

https://arxiv.org/abs/2403.00567

PDF

https://arxiv.org/pdf/2403.00567.pdf

