Abstract
Precise identification of individual cows is a fundamental prerequisite for comprehensive digital management in smart livestock farming. While existing animal identification methods excel in controlled, single-camera settings, they face severe challenges in cross-camera generalization. When models trained on source cameras are deployed to new monitoring nodes with divergent illumination, backgrounds, viewpoints, and heterogeneous imaging properties, recognition performance often degrades dramatically. This limits the large-scale application of non-contact technologies in dynamic, real-world farming environments. To address this challenge, this study proposes a cross-camera cow identification framework based on disentangled representation learning. The framework applies Subspace Identifiability Guarantee (SIG) theory to bovine visual recognition. By modeling the underlying physical data-generation process, we designed a principle-driven feature disentanglement module that decomposes observed images into multiple orthogonal latent subspaces. This mechanism effectively isolates stable, identity-related biometric features that remain invariant across cameras, thereby substantially improving generalization to unseen cameras. We constructed a high-quality dataset spanning five distinct camera nodes, covering heterogeneous acquisition devices and complex variations in lighting and viewing angle. Extensive experiments across seven cross-camera tasks demonstrate that the proposed method achieves an average accuracy of 86.0%, significantly outperforming the source-only baseline (51.9%) and the strongest cross-camera baseline method (79.8%). This work establishes a subspace-theoretic feature disentanglement framework for collaborative cross-camera cow identification, offering a new paradigm for precise animal monitoring in uncontrolled smart farming environments.
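To make the core idea concrete, the sketch below shows one common way to realize the kind of subspace decomposition the abstract describes: a backbone feature is projected into separate identity and camera subspaces, and an orthogonality penalty on the projection bases discourages camera-specific information from leaking into the identity subspace. This is a minimal NumPy illustration under assumed dimensions, not the authors' SIG-based implementation; all names (`W_id`, `W_cam`, `orthogonality_penalty`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
feat_dim, id_dim, cam_dim = 16, 8, 8

# Hypothetical projection bases; in the real model these would be learned
# jointly with the backbone under identity and disentanglement losses.
W_id = rng.normal(size=(feat_dim, id_dim))
W_cam = rng.normal(size=(feat_dim, cam_dim))

def disentangle(x):
    """Split a backbone feature into identity and camera components."""
    return x @ W_id, x @ W_cam

def orthogonality_penalty(W_a, W_b):
    """Squared Frobenius norm of the cross-product of the two bases.
    Driving this toward zero pushes the subspaces to be orthogonal,
    so identity features carry no camera-specific signal."""
    return float(np.linalg.norm(W_a.T @ W_b, ord="fro") ** 2)

x = rng.normal(size=(4, feat_dim))          # a batch of 4 backbone features
z_id, z_cam = disentangle(x)
penalty = orthogonality_penalty(W_id, W_cam)
print(z_id.shape, z_cam.shape, penalty)
```

At inference on an unseen camera, only the identity projection `z_id` would be used for matching, which is what gives the method its cross-camera robustness in this formulation.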
URL
https://arxiv.org/abs/2602.07566