Paper Reading AI Learner

A Hybrid Generative and Discriminative PointNet on Unordered Point Sets

2024-04-19 14:52:25
Yang Ye, Shihao Ji

Abstract

Point clouds provide a natural and flexible representation for many applications (e.g., robotics and self-driving cars), so the ability to synthesize point clouds for analysis has become crucial. Recently, Xie et al. proposed a generative model for unordered point sets in the form of an energy-based model (EBM). Although that model achieves impressive performance on point cloud generation, a separate model must be trained for each category to capture the complex point set distributions. Moreover, their method cannot classify point clouds directly and requires additional fine-tuning for classification. This raises an interesting question: can a single network be trained as a hybrid generative and discriminative model of point clouds? A similar question was recently answered in the affirmative for images by the Joint Energy-based Model (JEM) framework, which achieves high performance in image classification and generation simultaneously. This paper proposes GDPNet, the first hybrid Generative and Discriminative PointNet, which extends JEM to point cloud classification and generation. GDPNet retains the strong discriminative power of modern PointNet classifiers while generating point cloud samples that rival state-of-the-art generative approaches.
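
The JEM idea the abstract builds on is that a classifier's logits can be read in two ways: a softmax over them gives p(y | x), and their negative log-sum-exp gives an energy E(x) with p(x) proportional to exp(-E(x)). The sketch below illustrates only that reinterpretation on a toy PointNet-style network (shared per-point layer followed by max-pooling, which makes the output independent of point order). It is not the GDPNet architecture: the layer sizes, random weights, and the names `pointnet_logits`, `class_probs`, and `energy` are assumptions made for illustration, and training and sampling are not shown.

```python
import math
import random

def pointnet_logits(points, w_point, w_cls):
    """Toy PointNet-style classifier (hypothetical, not the paper's network)."""
    # Shared per-point layer with ReLU: each 3-D point -> hidden feature vector.
    feats = [
        [max(0.0, sum(w * c for w, c in zip(row, p))) for row in w_point]
        for p in points
    ]
    # Symmetric max-pooling over points makes the result order-invariant.
    pooled = [max(col) for col in zip(*feats)]
    # Linear classification head: one logit per class.
    return [sum(w * f for w, f in zip(row, pooled)) for row in w_cls]

def logsumexp(values):
    # Numerically stable log(sum(exp(v))).
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def class_probs(logits):
    # Discriminative view: p(y | x) = softmax(logits).
    lse = logsumexp(logits)
    return [math.exp(v - lse) for v in logits]

def energy(logits):
    # Generative view (JEM): E(x) = -logsumexp over class logits,
    # so the unnormalized density is exp(-E(x)).
    return -logsumexp(logits)

# Random weights and a random 16-point cloud, purely for demonstration.
rng = random.Random(0)
hidden, num_classes = 8, 4
w_point = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(hidden)]
w_cls = [[rng.gauss(0, 1) for _ in range(hidden)] for _ in range(num_classes)]
cloud = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(16)]
shuffled = cloud[:]
rng.shuffle(shuffled)

logits = pointnet_logits(cloud, w_point, w_cls)
logits_shuffled = pointnet_logits(shuffled, w_point, w_cls)
probs = class_probs(logits)

print("class probabilities:", probs)
print("energy:", energy(logits))
print("energy after shuffling points:", energy(logits_shuffled))
```

Because the same logits feed both views, a single set of weights serves as classifier and as unnormalized density model, which is the property that lets one network replace the per-category models mentioned above.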

URL

https://arxiv.org/abs/2404.12925

PDF

https://arxiv.org/pdf/2404.12925.pdf

