CoRA: A Collaborative Robust Architecture with Hybrid Fusion for Efficient Perception

2025-12-15 11:00:38
Gong Chen, Chaokun Zhang, Pengcheng Lv, Xiaohui Xie

Abstract

Collaborative perception has garnered significant attention as a crucial technology for overcoming the perceptual limitations of single-agent systems. Many state-of-the-art (SOTA) methods achieve both communication efficiency and high performance via intermediate fusion. However, they share a critical vulnerability: their performance degrades under adverse communication conditions because of the misalignment induced by data transmission, which severely hampers practical deployment. To bridge this gap, we re-examine different fusion paradigms and find that the strengths of intermediate and late fusion are not a trade-off but a complementary pairing. Based on this key insight, we propose CoRA, a novel collaborative robust architecture that takes a hybrid approach to decouple performance from robustness while keeping communication low. It is composed of two components: a feature-level fusion branch and an object-level correction branch. The first branch selects critical features and fuses them efficiently to ensure both performance and scalability. The second branch leverages semantic relevance to correct spatial displacements, providing resilience against pose errors. Experiments demonstrate the superiority of CoRA. Under extreme scenarios, CoRA improves upon its baseline by approximately 19% in AP@0.7 with a more than 5x reduction in communication volume, which makes it a promising solution for robust collaborative perception.
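
The two-branch idea in the abstract can be illustrated with a toy sketch. This is not the CoRA implementation: the abstract does not specify how features are selected, how they are fused, or how semantic relevance is computed. The sketch assumes top-k selection by a per-cell importance score, element-wise max fusion, and class-label matching as a crude stand-in for semantic relevance. All function names (select_top_k, fuse_max, estimate_shift) and the max_dist parameter are hypothetical.

```python
from math import hypot


def select_top_k(features, scores, k):
    """Feature-level branch (assumed form): keep the k highest-scoring cells.

    features: dict mapping (row, col) -> list of floats
    scores:   dict mapping (row, col) -> float importance score
    Returns a sparse dict holding only the selected cells, i.e. the payload
    an agent would transmit.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {cell: features[cell] for cell in ranked[:k]}


def fuse_max(ego, received):
    """Fuse ego features with received sparse features by element-wise max.

    Max fusion is an assumption made for this sketch, not the paper's operator.
    """
    fused = {cell: list(vec) for cell, vec in ego.items()}
    for cell, vec in received.items():
        if cell in fused:
            fused[cell] = [max(a, b) for a, b in zip(fused[cell], vec)]
        else:
            fused[cell] = list(vec)
    return fused


def estimate_shift(ego_boxes, remote_boxes, max_dist=2.0):
    """Object-level branch (assumed form): estimate a 2D translation.

    Each box is (label, x, y). A remote box is paired with the nearest ego
    box that has the same label and lies within max_dist; the mean offset
    over all pairs is returned. Returns (0.0, 0.0) when nothing matches.
    """
    offsets = []
    for label, rx, ry in remote_boxes:
        candidates = [
            (hypot(ex - rx, ey - ry), ex - rx, ey - ry)
            for el, ex, ey in ego_boxes
            if el == label
        ]
        if candidates:
            dist, dx, dy = min(candidates)
            if dist <= max_dist:
                offsets.append((dx, dy))
    if not offsets:
        return (0.0, 0.0)
    n = len(offsets)
    return (sum(dx for dx, _ in offsets) / n, sum(dy for _, dy in offsets) / n)
```

In this toy setting, sending k cells out of a grid of N cuts the transmitted feature payload to k/N of the dense map, and the estimated shift can be added to the remote boxes before late fusion to compensate for a pose offset.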

URL

https://arxiv.org/abs/2512.13191

PDF

https://arxiv.org/pdf/2512.13191.pdf

