Abstract
Objects we encounter often change appearance as we interact with them. Changes in illumination (shadows), object pose, or movement of nonrigid objects can drastically alter available image features. How do biological visual systems track objects as they change? It may involve specific attentional mechanisms for reasoning about the locations of objects independently of their appearances -- a capability that prominent neuroscientific theories have associated with computing through neural synchrony. We computationally test the hypothesis that the implementation of visual attention through neural synchrony underlies the ability of biological visual systems to track objects that change in appearance over time. We first introduce a novel deep learning circuit that can learn to precisely control attention to features separately from their location in the world through neural synchrony: the complex-valued recurrent neural network (CV-RNN). Next, we compare object tracking in humans, the CV-RNN, and other deep neural networks (DNNs), using FeatureTracker: a large-scale challenge that asks observers to track objects as their locations and appearances change in precisely controlled ways. While humans effortlessly solved FeatureTracker, state-of-the-art DNNs did not. In contrast, our CV-RNN behaved similarly to humans on the challenge, providing a computational proof-of-concept for the role of phase synchronization as a neural substrate for tracking appearance-morphing objects as they move about.
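The idea of attending to objects by synchrony rather than by appearance can be illustrated with a toy NumPy sketch. This is not the CV-RNN from the paper. It assumes, for illustration only, that each unit is a complex number whose magnitude stands for feature strength (appearance) and whose phase acts as a grouping tag. The function names `phase_synchrony` and `attended_readout` and all values are hypothetical.

```python
import numpy as np

def phase_synchrony(phases):
    """Order parameter in [0, 1]: 1 means all phases agree, near 0 means they cancel."""
    return float(np.abs(np.mean(np.exp(1j * phases))))

def attended_readout(z, target_phase):
    """Keep each unit's magnitude only to the extent its phase matches target_phase."""
    alignment = np.cos(np.angle(z) - target_phase)   # +1 in phase, -1 anti-phase
    return np.abs(z) * np.clip(alignment, 0.0, None)

# Hypothetical toy population: units 0-2 belong to the target, units 3-5 to a distractor.
phases = np.array([0.0, 0.0, 0.0, np.pi, np.pi, np.pi])

# Two different "appearances" (magnitudes) for the same phase assignment.
appearance_a = np.array([1.0, 0.2, 0.7, 1.0, 0.2, 0.7])
appearance_b = np.array([0.3, 0.9, 0.5, 0.3, 0.9, 0.5])

readout_a = attended_readout(appearance_a * np.exp(1j * phases), target_phase=0.0)
readout_b = attended_readout(appearance_b * np.exp(1j * phases), target_phase=0.0)

# Which units are selected depends on phase alone, not on appearance.
selected_a = readout_a > 0
selected_b = readout_b > 0

sync_target = phase_synchrony(phases[:3])   # units sharing one phase
sync_all = phase_synchrony(phases)          # two anti-phase groups cancel out
```

In this sketch the selected units are identical for both appearances, because selection uses only the phase tag. That mirrors, in a very reduced form, the abstract's notion of reasoning about where an object is independently of what it looks like.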
Abstract (translated)
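Objects we encounter often change appearance as we interact with them. Changes in illumination, object pose, or the movement of nonrigid objects can drastically alter the available image features. How do biological visual systems track objects as they change? Tracking may involve specific attentional mechanisms for reasoning about the locations of objects independently of their appearance, a capability that prominent neuroscientific theories associate with computation through neural synchrony. We computationally test the hypothesis that implementing visual attention through neural synchrony underlies the ability of biological visual systems to track objects whose appearance changes over time. First, we introduce a novel deep learning circuit that can learn, through neural synchrony, to precisely control attention to features separately from their location in the world: the complex-valued recurrent neural network (CV-RNN). Next, we use FeatureTracker, a large-scale challenge, to compare object tracking in humans, the CV-RNN, and other deep neural networks (DNNs). Although humans solved FeatureTracker effortlessly, state-of-the-art DNNs did not. In contrast, our CV-RNN behaved similarly to humans on the challenge, providing a computational proof-of-concept for phase synchronization as a neural substrate for tracking appearance-morphing objects as they move.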
URL
https://arxiv.org/abs/2410.02094