Abstract
Contrastive self-supervised learning based on point-wise comparisons has been widely studied for vision tasks. In the visual cortex of the brain, neuronal responses to distinct stimulus classes are organized into geometric structures known as neural manifolds. Accurate classification of stimuli can be achieved by effectively separating these manifolds, akin to solving a packing problem. We introduce Contrastive Learning As Manifold Packing (CLAMP), a self-supervised framework that recasts representation learning as a manifold packing problem. CLAMP uses a loss function inspired by the potential energy of short-range repulsive particle systems, such as those encountered in the physics of simple liquids and jammed packings. In this framework, each class consists of sub-manifolds, each of which embeds multiple augmented views of a single image. The sizes and positions of the sub-manifolds are dynamically optimized by following the gradient of a packing loss. This approach yields interpretable dynamics in the embedding space that parallel jamming physics, and introduces geometrically meaningful hyperparameters within the loss function. Under the standard linear evaluation protocol, which freezes the backbone and trains only a linear classifier, CLAMP achieves performance competitive with state-of-the-art self-supervised models. Furthermore, our analysis reveals that neural manifolds corresponding to different categories emerge naturally and are effectively separated in the learned representation space, highlighting the potential of CLAMP to bridge insights from physics, neuroscience, and machine learning.
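The abstract does not give the functional form of the packing loss, so the following is only an illustrative sketch of what a short-range repulsive energy over sub-manifolds could look like. It assumes a harmonic soft-sphere potential of the kind commonly used in jamming studies, and it summarizes each sub-manifold by its centroid and root-mean-square radius. The function names, the `scale` parameter, and the choice of radius are all hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def centroid(points):
    """Mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def radius(points, center):
    """Root-mean-square distance of the points from their centroid."""
    return math.sqrt(sum(math.dist(p, center) ** 2 for p in points) / len(points))


def packing_loss(submanifolds, scale=1.0):
    """Sum of harmonic soft-sphere repulsions over all pairs of sub-manifolds.

    Assumption (not from the paper): each sub-manifold, i.e. the embedded
    augmented views of one image, is summarized as a sphere with an RMS
    radius multiplied by `scale`. Two spheres contribute only when they
    overlap, so the energy is short-ranged and is exactly zero once every
    pair is separated.
    """
    spheres = []
    for views in submanifolds:
        c = centroid(views)
        spheres.append((c, scale * radius(views, c)))

    loss = 0.0
    for (c_i, r_i), (c_j, r_j) in combinations(spheres, 2):
        contact = r_i + r_j            # distance at which the spheres touch
        gap = math.dist(c_i, c_j)      # centroid-to-centroid distance
        if contact > 0 and gap < contact:
            loss += (1.0 - gap / contact) ** 2
    return loss


# Two images, two augmented views each, embedded in 2D.
overlapping = [[[0.0, 0.0], [2.0, 0.0]], [[1.0, 0.0], [3.0, 0.0]]]
separated = [[[0.0, 0.0], [2.0, 0.0]], [[10.0, 0.0], [12.0, 0.0]]]
print(packing_loss(overlapping))  # 0.25
print(packing_loss(separated))    # 0.0
```

In a training setting the embeddings would come from the backbone and the loss would be minimized with an automatic-differentiation framework. The sketch only evaluates the energy, to show that overlapping sub-manifolds are penalized while well-separated ones contribute nothing.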
URL
https://arxiv.org/abs/2506.13717