ManifoldNet: A Deep Network Framework for Manifold-valued Data

2018-09-11 00:27:48
Rudrasis Chakraborty, Jose Bouza, Jonathan Manton, Baba C. Vemuri

Abstract

Deep neural networks have become the main workhorse for many tasks involving learning from data in a variety of applications in science and engineering. Traditionally, the inputs to these networks lie in a vector space, and the operations employed within the network are well defined on vector spaces. In the recent past, due to technological advances in sensing, it has become possible to acquire manifold-valued data sets either directly or indirectly. Examples include, but are not limited to, data from omnidirectional cameras on automobiles and drones, synthetic aperture radar imaging, and diffusion magnetic resonance imaging, elastography, and conductance imaging in the medical imaging domain. Thus, there is a need to generalize deep neural networks to cope with input data that reside on curved manifolds, where vector space operations are not naturally admissible. In this paper, we present a novel theoretical framework to generalize the widely popular convolutional neural networks (CNNs) to high-dimensional manifold-valued data inputs. We call these networks ManifoldNets. In ManifoldNets, the convolution operation on data residing on Riemannian manifolds is achieved via a provably convergent recursive computation of the weighted Fréchet mean (wFM) of the given data, where the weights make up the convolution mask, to be learned. Further, we prove that the proposed wFM layer achieves a contraction mapping, and hence ManifoldNet does not need the nonlinear ReLU unit used in standard CNNs. We present experiments, using the ManifoldNet framework, to achieve dimensionality reduction by computing the principal linear subspaces that naturally reside on a Grassmannian. The experimental results demonstrate the efficacy of ManifoldNets in the context of classification and reconstruction accuracy.
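
To make the idea of a recursive weighted Fréchet mean more concrete, here is a minimal sketch on the unit sphere, where the geodesic between two points has a closed form. It is an illustration under stated assumptions, not the authors' implementation: the function names (slerp, recursive_wfm) are hypothetical, the sphere is chosen only for simplicity, and the update rule (move from the running mean toward each new point by the fraction w_n / (w_1 + ... + w_n) along the geodesic) is one common incremental estimator that may differ in detail from the paper's. Weights are assumed positive and no two consecutive points antipodal.

```python
import math

def slerp(p, q, t):
    """Point at fraction t along the shortest geodesic from p to q on the unit sphere."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    theta = math.acos(dot)  # geodesic distance between p and q
    if theta < 1e-12:       # points coincide: nothing to interpolate
        return list(p)
    s = math.sin(theta)
    a = math.sin((1.0 - t) * theta) / s
    b = math.sin(t * theta) / s
    return [a * x + b * y for x, y in zip(p, q)]

def recursive_wfm(points, weights):
    """Incremental weighted Frechet mean estimate (illustrative sketch).

    Starts at the first point, then moves toward each new point along the
    geodesic by its weight divided by the running sum of weights.
    Assumes unit-norm points and positive weights.
    """
    mean = list(points[0])
    total = weights[0]
    for x, w in zip(points[1:], weights[1:]):
        total += w
        mean = slerp(mean, x, w / total)
    return mean

def on_circle(deg):
    """Helper: unit vector on the equator at the given angle in degrees."""
    r = math.radians(deg)
    return [math.cos(r), math.sin(r), 0.0]

# Three equally weighted points on a great circle at 0, 30 and 60 degrees.
estimate = recursive_wfm([on_circle(0), on_circle(30), on_circle(60)], [1.0, 1.0, 1.0])
```

In a ManifoldNet-style layer the weights would be learned parameters playing the role of the convolution mask; here they are fixed by hand. Because every step stays on the geodesic between two points of the manifold, the output never leaves the sphere.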

URL

https://arxiv.org/abs/1809.06211

PDF

https://arxiv.org/pdf/1809.06211.pdf

