Cross-Camera Cow Identification via Disentangled Representation Learning

2026-02-07 14:23:35
Runcheng Wang, Yaru Chen, Guiguo Zhang, Honghua Jiang, Yongliang Qiao

Abstract

Precise identification of individual cows is a fundamental prerequisite for comprehensive digital management in smart livestock farming. While existing animal identification methods excel in controlled, single-camera settings, they face severe challenges regarding cross-camera generalization. When models trained on source cameras are deployed to new monitoring nodes characterized by divergent illumination, backgrounds, viewpoints, and heterogeneous imaging properties, recognition performance often degrades dramatically. This limits the large-scale application of non-contact technologies in dynamic, real-world farming environments. To address this challenge, this study proposes a cross-camera cow identification framework based on disentangled representation learning. This framework leverages the Subspace Identifiability Guarantee (SIG) theory in the context of bovine visual recognition. By modeling the underlying physical data generation process, we design a principle-driven feature disentanglement module that decomposes observed images into multiple orthogonal latent subspaces. This mechanism effectively isolates stable, identity-related biometric features that remain invariant across cameras, thereby substantially improving generalization to unseen cameras. We construct a high-quality dataset spanning five distinct camera nodes, covering heterogeneous acquisition devices and complex variations in lighting and angles. Extensive experiments across seven cross-camera tasks demonstrate that the proposed method achieves an average accuracy of 86.0%, significantly outperforming the source-only baseline (51.9%) and the strongest cross-camera baseline method (79.8%). This work establishes a subspace-theoretic feature disentanglement framework for collaborative cross-camera cow identification, offering a new paradigm for precise animal monitoring in uncontrolled smart farming environments.
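The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea of splitting a latent code into an identity part and a camera part and penalizing overlap between them. It is not the authors' SIG-based module. The function names (`split_latent`, `orthogonality_penalty`), the latent sizes, and the use of a cross-covariance penalty as a stand-in for subspace orthogonality are all assumptions made for illustration.

```python
import numpy as np


def split_latent(z, id_dim):
    """Split latent vectors into a hypothetical identity part and camera part.

    z has shape (batch, latent_dim); the first id_dim columns are treated as
    identity-related and the remaining columns as camera-related (assumption).
    """
    return z[:, :id_dim], z[:, id_dim:]


def orthogonality_penalty(z_id, z_cam):
    """Sum of squared cross-covariance entries between the two latent parts.

    The value is close to zero when the two parts are linearly uncorrelated
    over the batch, and grows when they share information.
    """
    z_id_centered = z_id - z_id.mean(axis=0, keepdims=True)
    z_cam_centered = z_cam - z_cam.mean(axis=0, keepdims=True)
    cross_cov = z_id_centered.T @ z_cam_centered / z_id.shape[0]
    return float(np.sum(cross_cov ** 2))


rng = np.random.default_rng(0)

# Case 1: independently sampled latents, so the two parts barely overlap.
z = rng.normal(size=(256, 8))
z_id, z_cam = split_latent(z, id_dim=5)
penalty_independent = orthogonality_penalty(z_id, z_cam)

# Case 2: the camera part is a noisy copy of three identity dimensions,
# i.e. identity information has leaked into the camera subspace.
z_cam_entangled = z_id[:, :3] + 0.1 * rng.normal(size=(256, 3))
penalty_entangled = orthogonality_penalty(z_id, z_cam_entangled)

print(f"independent: {penalty_independent:.3f}")
print(f"entangled:   {penalty_entangled:.3f}")
```

In a training setup, a penalty of this kind would typically be added to the identification loss so that the encoder is pushed to keep identity and camera information in separate parts of the latent code. Whether the paper uses this or a different constraint is not stated in the abstract.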

URL

https://arxiv.org/abs/2602.07566

PDF

https://arxiv.org/pdf/2602.07566.pdf

