Abstract
In Generalized Category Discovery (GCD), we cluster unlabeled samples of known and novel classes, leveraging a training dataset of known classes. A salient challenge arises due to domain shifts between these datasets. To address this, we present a novel setting, Across Domain Generalized Category Discovery (AD-GCD), and introduce CDAD-NET (Class Discoverer Across Domains) as a remedy. CDAD-NET is designed to align potential known-class samples across the labeled (source) and unlabeled (target) datasets, while emphasizing the distinct categorization of the target data. To this end, we propose an entropy-driven adversarial learning strategy that accounts for the distance distributions of target samples relative to source-domain class prototypes. In parallel, the discriminative nature of the shared space is upheld through a combination of three metric learning objectives. In the source domain, we refine the proximity between samples and their affiliated class prototypes, while in the target domain, we employ a neighborhood-centric contrastive learning mechanism coupled with an adept neighbors-mining approach. To further capture the fine-grained feature relationships among semantically aligned images, we introduce conditional image inpainting, building on the premise that semantically similar images are more helpful to the task than unrelated ones. Experimentally, CDAD-NET surpasses existing methods by 8-15% on the three AD-GCD benchmarks we present.
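As a rough illustration of the entropy-driven weighting the abstract describes, the sketch below scores each target sample by the entropy of its softmax distance distribution over source-domain class prototypes, so that likely-known-class targets (low entropy) can contribute more strongly to adversarial alignment than likely-novel targets (high entropy). This is a minimal assumption-laden sketch, not the authors' implementation; the function name, temperature, and weighting scheme are illustrative.

```python
# Minimal sketch (assumed, not the authors' code): entropy of a target sample's
# distance distribution over source-domain class prototypes, used as an
# alignment weight. All names and hyperparameters here are assumptions.
import torch
import torch.nn.functional as F

def prototype_entropy_weights(target_feats: torch.Tensor,
                              prototypes: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """target_feats: (N, d) target embeddings; prototypes: (K, d) source class prototypes."""
    t = F.normalize(target_feats, dim=-1)           # unit-normalize embeddings
    p = F.normalize(prototypes, dim=-1)
    sims = t @ p.t() / temperature                   # (N, K) scaled cosine similarities
    probs = sims.softmax(dim=-1)                     # distance distribution over prototypes
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1)  # per-sample entropy
    max_entropy = torch.log(torch.tensor(float(prototypes.size(0))))
    # Low entropy -> close to a single known-class prototype -> higher alignment weight.
    return 1.0 - entropy / max_entropy               # (N,) weights in [0, 1]

# Hypothetical usage: weight a per-sample adversarial domain loss.
# w = prototype_entropy_weights(encoder(target_images), source_prototypes)
# adv_loss = (w * per_sample_domain_loss).mean()
```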
URL
https://arxiv.org/abs/2404.05366