
Causality-based Dual-Contrastive Learning Framework for Domain Generalization

2023-01-22 13:07:24
Zining Chen, Weiqiu Wang, Zhicheng Zhao, Aidong Men

Abstract

Domain Generalization (DG) is a sub-branch of out-of-distribution generalization that trains models on multiple source domains and generalizes them to unseen target domains. Several domain generalization algorithms have emerged recently, but most rely on complex architectures that do not transfer well. Contrastive learning has become a promising solution for DG because of its simplicity and efficiency. However, existing contrastive learning methods neglect domain shifts, which causes severe model confusion. In this paper, we propose a Dual-Contrastive Learning (DCL) module that contrasts both features and prototypes. We also design a novel Causal Fusion Attention (CFA) module that fuses diverse views of a single image to obtain its prototype. Furthermore, we introduce a Similarity-based Hard-pair Mining (SHM) strategy to leverage information on diversity shift. Extensive experiments show that our method outperforms state-of-the-art algorithms on three DG datasets. The proposed algorithm can also serve as a plug-and-play module that does not require domain labels.
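
The abstract does not spell out the loss functions, so the following is only a generic illustration of what contrasting a feature against class prototypes can look like, not the paper's DCL, CFA, or SHM. It assumes an InfoNCE-style objective with cosine similarity and a temperature; the function names (`prototype_contrast_loss`, `hardest_positive`), the temperature value, and the "least similar positive" mining rule are all hypothetical choices made for this sketch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def prototype_contrast_loss(feature, label, prototypes, temperature=0.1):
    """InfoNCE-style loss: pull `feature` toward the prototype of its own
    class and push it away from the other prototypes.

    Assumption: one prototype per class, indexed by class label.
    """
    logits = [cosine(feature, p) / temperature for p in prototypes]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[label]


def hardest_positive(anchor, candidates):
    """Index of the same-class candidate least similar to `anchor`.

    Assumption: a stand-in for similarity-based hard-pair mining, where
    "hard" simply means lowest cosine similarity.
    """
    sims = [cosine(anchor, c) for c in candidates]
    return min(range(len(sims)), key=sims.__getitem__)


# Two class prototypes in a 2-D feature space (toy values).
prototypes = [[1.0, 0.0], [0.0, 1.0]]
loss_correct = prototype_contrast_loss([1.0, 0.0], 0, prototypes)
loss_wrong = prototype_contrast_loss([1.0, 0.0], 1, prototypes)
hard_idx = hardest_positive([1.0, 0.0], [[1.0, 0.0], [0.6, 0.8], [0.8, 0.6]])
```

In this toy setup the loss is close to zero when the feature coincides with its own class prototype and large when it coincides with another class's prototype. In a real DG pipeline the features and prototypes would come from a trained encoder rather than hand-written vectors.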

URL

https://arxiv.org/abs/2301.09120

PDF

https://arxiv.org/pdf/2301.09120.pdf

