End-to-End CAD Model Retrieval and 9DoF Alignment in 3D Scans

2019-06-10 18:01:42
Armen Avetisyan, Angela Dai, Matthias Nießner

Abstract

We present a novel, end-to-end approach to align CAD models to a 3D scan of a scene, enabling transformation of a noisy, incomplete 3D scan into a compact CAD reconstruction with clean, complete object geometry. Our main contribution lies in formulating a differentiable Procrustes alignment that is paired with a symmetry-aware dense object correspondence prediction. To simultaneously align CAD models to all the objects of a scanned scene, our approach detects object locations, then predicts symmetry-aware dense object correspondences between scan and CAD geometry in a unified object space, as well as a nearest-neighbor CAD model, both of which are then used to inform a differentiable Procrustes alignment. Our approach operates in a fully-convolutional fashion, enabling alignment of CAD models to the objects of a scan in a single forward pass. This enables our method to outperform state-of-the-art approaches by $19.04\%$ for CAD model alignment to scans, with $\approx 250\times$ faster runtime than previous data-driven approaches.
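
For readers unfamiliar with Procrustes alignment, the following is a minimal NumPy sketch of the classical closed-form solution (the Kabsch algorithm) for a rigid transform between two sets of corresponding 3D points. It is not the authors' implementation: the function name `procrustes_align`, the optional per-point `weights`, and the synthetic data are assumptions made for illustration, and the sketch recovers only rotation and translation (6 of the 9 degrees of freedom), leaving scale aside. Every step is built from matrix products and an SVD, so the same computation could in principle be written in an autodiff framework to let gradients flow back to the predicted correspondences.

```python
import numpy as np


def procrustes_align(src, dst, weights=None):
    """Return (R, t) minimising sum_i w_i * ||R @ src[i] + t - dst[i]||^2.

    src, dst: (N, 3) arrays of corresponding points.
    weights:  optional (N,) non-negative confidences (hypothetical parameter).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()

    # Weighted centroids of both point sets.
    src_mean = w @ src
    dst_mean = w @ dst

    # Centre the points, then build the 3x3 weighted cross-covariance matrix.
    src_c = src - src_mean
    dst_c = dst - dst_mean
    H = (src_c * w[:, None]).T @ dst_c

    # SVD of the cross-covariance gives the optimal orthogonal matrix.
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Translation maps the rotated source centroid onto the target centroid.
    t = dst_mean - R @ src_mean
    return R, t


# Synthetic check: rotate and translate random points, then recover the transform.
rng = np.random.default_rng(0)
scan_pts = rng.normal(size=(50, 3))
angle = np.pi / 6
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([0.5, -1.0, 2.0])
cad_pts = scan_pts @ R_true.T + t_true

R_est, t_est = procrustes_align(scan_pts, cad_pts)
print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```

In this toy setting the correspondences are exact, so the recovered transform should match the one used to generate the data up to floating-point error. With predicted, noisy correspondences, the weights would give a way to down-weight unreliable matches.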

URL

https://arxiv.org/abs/1906.04201

PDF

https://arxiv.org/pdf/1906.04201.pdf

