Attack on Scene Flow using Point Clouds

2024-04-21 11:21:27
Haniyeh Ehsani Oskouie, Mohammad-Shahram Moin, Shohreh Kasaei

Abstract

Deep neural networks have made significant advances in accurately estimating scene flow from point clouds, which is vital for applications such as video analysis, action recognition, and navigation. The robustness of these techniques, however, remains a concern, particularly in the face of adversarial attacks, which have been shown to deceive state-of-the-art deep neural networks in many domains. Surprisingly, the robustness of scene flow networks against such attacks has not been thoroughly investigated. To bridge this gap, the proposed approach introduces adversarial white-box attacks specifically tailored for scene flow networks. Experimental results show that the generated adversarial examples obtain up to 33.7 relative degradation in average end-point error on the KITTI and FlyingThings3D datasets. The study also reveals that attacks targeting point clouds in only one dimension or color channel have a significant impact on average end-point error. An analysis of the success and failure of these attacks on the scene flow networks and their 2D optical flow network variants shows a higher vulnerability for the optical flow networks.
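
The abstract reports its results as a relative degradation in average end-point error (EPE). The sketch below illustrates one common way to compute such a figure. It is not taken from the paper: the function names, the toy flow values, and the definition of relative degradation as (adversarial EPE - clean EPE) / clean EPE are all assumptions made for illustration.

```python
import numpy as np


def average_epe(pred_flow, gt_flow):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    Both inputs are arrays of shape (N, 3): one 3D flow vector per point.
    """
    pred_flow = np.asarray(pred_flow, dtype=float)
    gt_flow = np.asarray(gt_flow, dtype=float)
    return float(np.linalg.norm(pred_flow - gt_flow, axis=-1).mean())


def relative_degradation(epe_clean, epe_adv):
    """Assumed definition: increase in EPE relative to the clean EPE."""
    return (epe_adv - epe_clean) / epe_clean


# Toy values, invented purely for illustration (not results from the paper).
gt = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
pred_clean = np.array([[1.0, 0.0, 0.5], [0.0, 2.0, 0.5]])  # each point off by 0.5
pred_adv = np.array([[1.0, 0.0, 1.0], [0.0, 2.0, 1.0]])    # each point off by 1.0

epe_clean = average_epe(pred_clean, gt)
epe_adv = average_epe(pred_adv, gt)
degradation = relative_degradation(epe_clean, epe_adv)
print(epe_clean, epe_adv, degradation)
```

Under this assumed definition, a value of 1.0 would mean the adversarial input doubled the end-point error; the paper may normalize or scale the figure differently.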

URL

https://arxiv.org/abs/2404.13621

PDF

https://arxiv.org/pdf/2404.13621.pdf

