Paper Reading AI Learner

BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization

2025-05-22 13:12:46
Xueyang Zhou, Guiyao Tie, Guowen Zhang, Hechang Wang, Pan Zhou, Lichao Sun

Abstract

Vision-Language-Action (VLA) models have advanced robotic control by enabling end-to-end decision-making directly from multimodal inputs. However, their tightly coupled architectures expose novel security vulnerabilities. Unlike traditional adversarial perturbations, backdoor attacks represent a stealthier, persistent, and practically significant threat, particularly under the emerging Training-as-a-Service paradigm, yet they remain largely unexplored in the context of VLA models. To address this gap, we propose BadVLA, a backdoor attack method based on Objective-Decoupled Optimization, which for the first time exposes the backdoor vulnerabilities of VLA models. Specifically, it consists of a two-stage process: (1) explicit feature-space separation to isolate trigger representations from benign inputs, and (2) conditional control deviations that activate only in the presence of the trigger, while preserving clean-task performance. Empirical results on multiple VLA benchmarks demonstrate that BadVLA consistently achieves near-100% attack success rates with minimal impact on clean-task accuracy. Further analyses confirm its robustness against common input perturbations, task transfers, and model fine-tuning, underscoring critical security vulnerabilities in current VLA deployments. Our work offers the first systematic investigation of backdoor vulnerabilities in VLA models, highlighting an urgent need for secure and trustworthy embodied model design practices. We have released the project page at this https URL.
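
As a rough illustration of the two-stage process described in the abstract, the following PyTorch-style sketch shows what an objective-decoupled backdoor objective might look like. The corner-patch trigger, the encoder and model call signatures, the margin value, and the specific loss forms are all assumptions made for illustration; they are not taken from the paper's released implementation.

# Illustrative sketch only: the trigger, model interfaces, and losses below
# are assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F

def apply_trigger(images: torch.Tensor) -> torch.Tensor:
    # Assumed visual trigger: a small bright patch in one image corner,
    # for images shaped (B, C, H, W).
    triggered = images.clone()
    triggered[:, :, :16, :16] = 1.0  # hypothetical 16x16 patch
    return triggered

def stage1_feature_separation(encoder, images, margin=1.0):
    # Stage 1: push trigger features away from benign features in the
    # encoder's representation space, without touching the action head.
    feats_clean = encoder(images)                # assumed shape (B, D)
    feats_trig = encoder(apply_trigger(images))  # assumed shape (B, D)
    dist = F.pairwise_distance(feats_clean, feats_trig)
    # Hinge-style separation: require at least `margin` distance.
    return F.relu(margin - dist).mean()

def stage2_conditional_deviation(model, images, instructions, actions, target_action):
    # Stage 2: deviate the predicted action only when the trigger is present,
    # while preserving clean-task behaviour on benign inputs.
    pred_clean = model(images, instructions)
    pred_trig = model(apply_trigger(images), instructions)
    loss_clean = F.mse_loss(pred_clean, actions)  # keep clean task intact
    loss_attack = F.mse_loss(pred_trig, target_action.expand_as(pred_trig))
    return loss_clean + loss_attack

In this reading, stage 1 would first be optimized to implant a separable trigger representation in the vision encoder, and stage 2 would then fine-tune the policy so that control deviates only under the trigger; the two objectives are kept decoupled so clean-task accuracy is not degraded.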

Abstract (translated)

Vision-Language-Action (VLA) models have advanced robotic control by enabling end-to-end decision-making directly from multimodal inputs. However, these tightly coupled architectures expose new security vulnerabilities. Unlike traditional adversarial perturbations, backdoor attacks represent a stealthier and more persistent threat of practical significance under the emerging Training-as-a-Service paradigm, yet they remain insufficiently studied in the context of VLA models. To fill this gap, we propose BadVLA, a method based on objective-decoupled optimization that exposes backdoor vulnerabilities in VLA models for the first time. Concretely, it consists of a two-stage process: (1) explicit feature-space separation to isolate trigger representations from benign inputs, and (2) conditional control deviations that activate only when the trigger is present, while leaving clean-task performance unaffected. Empirical results on multiple VLA benchmarks show that BadVLA consistently achieves attack success rates close to 100% with negligible impact on clean-task accuracy. Further analysis confirms its robustness to common input perturbations, task transfer, and model fine-tuning, highlighting critical security vulnerabilities in current VLA deployments. Our work provides the first systematic study of backdoor vulnerabilities in VLA models and underscores the urgent need for secure and trustworthy embodied model design practices. We have released the project page at this https URL.

URL

https://arxiv.org/abs/2505.16640

PDF

https://arxiv.org/pdf/2505.16640.pdf

