
Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation

2024-04-16 07:07:40
Jiapeng Su, Qi Fan, Guangming Lu, Fanglin Chen, Wenjie Pei

Abstract

Few-shot semantic segmentation (FSS) has achieved great success in segmenting objects of novel classes, supported by only a few annotated samples. However, existing FSS methods often underperform in the presence of domain shifts, especially when encountering new domain styles that are unseen during training. It is suboptimal to directly adapt or generalize the entire model to new domains in the few-shot scenario. Instead, our key idea is to train a small adapter that rectifies diverse target domain styles to the source domain. Consequently, the rectified target domain features can benefit from the well-optimized source domain segmentation model, which is trained on sufficient source domain data. Training the domain-rectifying adapter requires sufficiently diverse target domains. We thus propose a novel local-global style perturbation method to simulate diverse potential target domains by perturbing the feature channel statistics of individual images and the collective statistics of the entire source domain, respectively. Additionally, we propose a cyclic domain alignment module that helps the adapter rectify domains effectively using a reverse domain rectification supervision. The adapter is trained to rectify the image features from diverse synthesized target domains to align with the source domain. During testing on target domains, we start by rectifying the image features and then conduct few-shot segmentation on the domain-rectified features. Extensive experiments demonstrate the effectiveness of our method, achieving promising results on cross-domain few-shot semantic segmentation tasks. Our code is available at this https URL.
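
The style perturbation idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (channel_stats, perturb_style), the multiplicative Gaussian jitter, and the strength parameter are all assumptions made for illustration. It only shows the general technique of re-styling a feature map by changing its per-channel mean and standard deviation, either from the image's own statistics ("local") or from dataset-level statistics ("global").

```python
import numpy as np


def channel_stats(feat, eps=1e-5):
    """Per-channel mean and std of a (C, H, W) feature map."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = np.sqrt(feat.var(axis=(1, 2), keepdims=True) + eps)
    return mean, std


def perturb_style(feat, rng, strength=0.5, global_mean=None, global_std=None):
    """Re-style a feature map by jittering its channel statistics.

    Hypothetical helper: with global_mean/global_std left as None, the
    image's own statistics are jittered (a "local" perturbation). Passing
    dataset-level statistics of shape (C, 1, 1) gives a "global" variant.
    The jitter model (multiplicative Gaussian noise) is an assumption.
    """
    mean, std = channel_stats(feat)
    # Remove the current style, keeping the normalized content.
    normalized = (feat - mean) / std

    base_mean = mean if global_mean is None else global_mean
    base_std = std if global_std is None else global_std

    # Sample new statistics around the base statistics.
    new_mean = base_mean * (1.0 + strength * rng.standard_normal(base_mean.shape))
    new_std = base_std * np.abs(1.0 + strength * rng.standard_normal(base_std.shape))

    # Apply the new style to the normalized content.
    return normalized * new_std + new_mean
```

A call such as perturb_style(feat, np.random.default_rng(0)) returns a feature map of the same shape with altered channel statistics; with strength=0 and no global statistics it reproduces the input up to floating-point error. In the setting the abstract describes, features perturbed this way would play the role of synthesized target-domain features that the adapter learns to map back toward the source domain.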

URL

https://arxiv.org/abs/2404.10322

PDF

https://arxiv.org/pdf/2404.10322.pdf

