Paper Reading AI Learner

CT-block: a novel local and global features extractor for point cloud

2021-11-30 13:46:52
Shangwei Guo, Jun Li, Zhengchao Lai, Xiantong Meng, Shaokun Han

Abstract

Deep learning on the point cloud is increasingly developing. Grouping the point with its neighbors and conducting convolution-like operation on them can learn the local feature of the point cloud, but this method is weak to extract the long-distance global feature. Performing the attention-based transformer on the whole point cloud can effectively learn the global feature of it, but this method is hardly to extract the local detailed feature. In this paper, we propose a novel module that can simultaneously extract and fuse local and global features, which is named as CT-block. The CT-block is composed of two branches, where the letter C represents the convolution-branch and the letter T represents the transformer-branch. The convolution-branch performs convolution on the grouped neighbor points to extract the local feature. Meanwhile, the transformer-branch performs offset-attention process on the whole point cloud to extract the global feature. Through the bridge constructed by the feature transmission element in the CT-block, the local and global features guide each other during learning and are fused effectively. We apply the CT-block to construct point cloud classification and segmentation networks, and evaluate the performance of them by several public datasets. The experimental results show that, because the features learned by CT-block are much expressive, the performance of the networks constructed by the CT-block on the point cloud classification and segmentation tasks achieve state of the art.

Abstract (translated)

URL

https://arxiv.org/abs/2111.15400

PDF

https://arxiv.org/pdf/2111.15400.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot