SurfelWarp: Efficient Non-Volumetric Single View Dynamic Reconstruction

2019-04-30 06:57:53
Wei Gao, Russ Tedrake

Abstract

We contribute a dense SLAM system that takes a live stream of depth images as input and reconstructs non-rigid deforming scenes in real time, without templates or prior models. In contrast to existing approaches, we do not maintain any volumetric data structures, such as truncated signed distance function (TSDF) fields or deformation fields, which are performance- and memory-intensive. Our system works with a flat point (surfel) based representation of geometry, which can be directly acquired from commodity depth sensors. Standard graphics pipelines and general-purpose GPU (GPGPU) computing are leveraged for all central operations, i.e., nearest neighbor maintenance, non-rigid deformation field estimation, and fusion of depth measurements. Our pipeline inherently avoids expensive volumetric operations such as marching cubes, volumetric fusion, and dense deformation field update, leading to significantly improved performance. Furthermore, the explicit and flexible surfel-based geometry representation enables efficient handling of topology changes and tracking failures, which keeps our reconstructions consistent with updated depth observations. Our system allows robots to maintain a scene description with non-rigidly deformed objects, which potentially enables interactions with dynamic working environments.
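To make the "flat point (surfel) based representation" and the "fusion of depth measurements" more concrete, here is a minimal sketch of a surfel record and a confidence-weighted fusion step. The abstract does not specify the fusion rule; the running weighted average below is a common choice in surfel-based reconstruction and is an assumption here, as are all names (`Surfel`, `fuse`, and the field names). The actual system runs on the GPU, while this sketch is plain CPU Python for illustration only.

```python
from dataclasses import dataclass
import math

Vec3 = tuple[float, float, float]


@dataclass
class Surfel:
    """Hypothetical surfel record: an oriented disk, not the paper's exact layout."""
    position: Vec3      # disk center in world coordinates (meters)
    normal: Vec3        # unit surface normal
    radius: float       # disk radius (meters)
    confidence: float   # accumulated weight of the observations fused so far


def _blend(a: Vec3, b: Vec3, wa: float, wb: float) -> Vec3:
    """Weighted average of two 3-vectors."""
    total = wa + wb
    return tuple((wa * x + wb * y) / total for x, y in zip(a, b))


def _normalize(v: Vec3) -> Vec3:
    """Scale a vector to unit length (assumes it is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def fuse(surfel: Surfel, obs: Surfel) -> Surfel:
    """Fold one new observation into an existing surfel.

    Assumed rule: confidence-weighted running average of position,
    normal, and radius; confidences add up.
    """
    w_old, w_new = surfel.confidence, obs.confidence
    return Surfel(
        position=_blend(surfel.position, obs.position, w_old, w_new),
        normal=_normalize(_blend(surfel.normal, obs.normal, w_old, w_new)),
        radius=(w_old * surfel.radius + w_new * obs.radius) / (w_old + w_new),
        confidence=w_old + w_new,
    )


# Usage: a surfel seen three times is updated with one new depth sample.
model = Surfel((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), 0.01, 3.0)
sample = Surfel((0.0, 0.0, 1.4), (0.0, 0.0, 1.0), 0.01, 1.0)
fused = fuse(model, sample)
```

Because each surfel is an independent record in a flat array, an update of this kind touches only the surfels that a new depth pixel maps to, which is the contrast with volumetric fusion that the abstract draws.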

URL

https://arxiv.org/abs/1904.13073

PDF

https://arxiv.org/pdf/1904.13073.pdf

