Paper Reading AI Learner

Semi-Supervised Single-View 3D Reconstruction via Prototype Shape Priors

2022-09-30 11:19:25
Zhen Xing, Hengduo Li, Zuxuan Wu, Yu-Gang Jiang

Abstract

The performance of existing single-view 3D reconstruction methods heavily relies on large-scale 3D annotations. However, such annotations are tedious and expensive to collect. Semi-supervised learning serves as an alternative way to mitigate the need for manual labels, but remains unexplored in 3D reconstruction. Inspired by the recent success of semi-supervised image classification tasks, we propose SSP3D, a semi-supervised framework for 3D reconstruction. In particular, we introduce an attention-guided prototype shape prior module for guiding realistic object reconstruction. We further introduce a discriminator-guided module to incentivize better shape generation, as well as a regularizer to tolerate noisy training samples. On the ShapeNet benchmark, the proposed approach outperforms previous supervised methods by clear margins under various labeling ratios, (i.e., 1%, 5% , 10% and 20%). Moreover, our approach also performs well when transferring to real-world Pix3D datasets under labeling ratios of 10%. We also demonstrate our method could transfer to novel categories with few novel supervised data. Experiments on the popular ShapeNet dataset show that our method outperforms the zero-shot baseline by over 12% and we also perform rigorous ablations and analysis to validate our approach.

Abstract (translated)

URL

https://arxiv.org/abs/2209.15383

PDF

https://arxiv.org/pdf/2209.15383.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot