
Cell Variational Information Bottleneck Network

2024-03-22 10:06:31
Zhonghua Zhai, Chen Ju, Jinsong Lan, Shuai Xiao

Abstract

In this work, we propose the Cell Variational Information Bottleneck Network (cellVIB), a convolutional neural network that uses an information bottleneck mechanism and can be combined with the latest feedforward network architectures and trained end to end. The network is constructed by stacking VIB cells, which generate feature maps with uncertainty. As the layers go deeper, the regularization effect gradually increases, instead of excessive regularization constraints being added directly to the output layer of the model as in Deep VIB. In each VIB cell, the feedforward process learns an independent mean term and a standard deviation term, and predicts a Gaussian distribution from them. The feedback process relies on the reparameterization trick for effective training. This work performs an extensive analysis on the MNIST dataset to verify the effectiveness of each VIB cell, and analyzes how the VIB cells affect mutual information. Experiments on CIFAR-10 also show that cellVIB is robust against noisy labels during training and against corrupted images during testing. We then validate our method on the PACS dataset, where the results show that the VIB cells can significantly improve the generalization performance of the base model. Finally, on a more complex representation learning task, face recognition, our network structure also achieves very competitive results.
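
The mechanism described in the abstract (a mean term, a standard deviation term, a Gaussian sample drawn with the reparameterization trick) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the linear heads, the softplus used to keep the standard deviation positive, the standard normal prior in the KL term, and all names (`vib_cell`, `mean_head`, `std_head`) are assumptions made for illustration only.

```python
import math
import random


def softplus(x):
    """Numerically stable softplus; used here (an assumption) to keep sigma > 0."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def linear(x, weights, bias):
    """Plain dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def reparameterize(mu, sigma, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise.

    The noise is drawn independently of mu and sigma, so gradients could
    flow through both terms in an autodiff framework.
    """
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def kl_to_standard_normal(mu, sigma):
    """KL(N(mu, sigma^2) || N(0, 1)) summed over dimensions.

    The N(0, 1) prior is an assumption borrowed from the usual VIB setup.
    """
    return sum(0.5 * (m * m + s * s - 1.0) - math.log(s)
               for m, s in zip(mu, sigma))


def vib_cell(h, mean_head, std_head, rng):
    """Hypothetical VIB cell: two independent heads predict mean and std."""
    mu = linear(h, *mean_head)
    sigma = [softplus(v) for v in linear(h, *std_head)]
    z = reparameterize(mu, sigma, rng)
    return z, kl_to_standard_normal(mu, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    h = [0.5, -1.0, 2.0]

    # Illustrative weights only; a real model would learn these.
    cell1 = (([[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]], [0.0, 0.0]),
             ([[0.0, 0.1, 0.0], [0.2, 0.0, 0.0]], [0.0, 0.0]))
    cell2 = (([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
             ([[0.1, 0.1], [0.1, 0.1]], [0.0, 0.0]))

    total_kl = 0.0
    for mean_head, std_head in (cell1, cell2):
        h, kl = vib_cell(h, mean_head, std_head, rng)
        total_kl += kl  # each stacked cell adds its own regularization term

    print("stochastic features:", h)
    print("accumulated KL regularizer:", total_kl)
```

In this sketch each stacked cell contributes its own KL term, so the accumulated regularizer grows with depth; how the paper actually weights or combines the per-cell terms is not stated in the abstract.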

URL

https://arxiv.org/abs/2403.15082

PDF

https://arxiv.org/pdf/2403.15082.pdf

