Second-order Democratic Aggregation

2018-08-22 18:07:26
Tsung-Yu Lin, Subhransu Maji, Piotr Koniusz

Abstract

Aggregated second-order features extracted from deep convolutional networks have been shown to be effective for texture generation, fine-grained recognition, material classification, and scene understanding. In this paper, we study a class of orderless aggregation functions designed to minimize interference or equalize contributions in the context of second-order features. We show that they can be computed just as efficiently as their first-order counterparts and that they have favorable properties over aggregation by summation. Another line of work has shown that matrix power normalization after aggregation can significantly improve the generalization of second-order representations. We show that matrix power normalization implicitly equalizes contributions during aggregation, thus establishing a connection between matrix normalization techniques and prior work on minimizing interference. Based on this analysis, we present γ-democratic aggregators that interpolate between sum pooling (γ=1) and democratic pooling (γ=0), outperforming both on several classification tasks. Moreover, unlike power normalization, the γ-democratic aggregations can be computed in a low-dimensional space by sketching, which allows the use of very high-dimensional second-order features. This results in state-of-the-art performance on several datasets.
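
The abstract does not spell out how the aggregation weights are computed, so the following is only a minimal NumPy sketch under stated assumptions, not the authors' implementation. It assumes that the kernel between local features is K_ij = (x_i^T x_j)^2, that each feature's contribution is alpha_i * sum_j K_ij * alpha_j, that the γ-democratic target for that contribution is (sum_j K_ij)^γ, and that the weights are found with a damped Sinkhorn-style iteration. The function names, the damping exponent of 0.5, and the iteration count are hypothetical choices made for illustration.

```python
import numpy as np


def gamma_democratic_weights(X, gamma=0.5, n_iter=10, damping=0.5, eps=1e-12):
    """Per-feature weights alpha for local features X of shape (n, d).

    Assumed setup (not taken from the abstract): the kernel is
    K_ij = (x_i . x_j)^2 and the target contribution of feature i is
    (sum_j K_ij)^gamma.
    """
    K = (X @ X.T) ** 2                  # second-order kernel, shape (n, n)
    target = K.sum(axis=1) ** gamma     # gamma=1: sum pooling, gamma=0: equal contributions
    alpha = np.ones(X.shape[0])
    for _ in range(n_iter):
        sigma = alpha * (K @ alpha)     # current contribution of each feature
        # Damped multiplicative update toward the target contributions.
        alpha = alpha * (target / (sigma + eps)) ** damping
    return alpha


def gamma_democratic_pool(X, gamma=0.5, n_iter=10):
    """Weighted second-order pooling: sum_i alpha_i * x_i x_i^T, shape (d, d)."""
    alpha = gamma_democratic_weights(X, gamma=gamma, n_iter=n_iter)
    return (X * alpha[:, None]).T @ X


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.abs(rng.standard_normal((20, 8)))    # stand-in for non-negative CNN activations
    A = gamma_democratic_pool(X, gamma=0.5)
    print(A.shape)
```

Under these assumptions, γ=1 leaves every weight at 1 and the result reduces to the plain sum of outer products, while γ=0 drives the per-feature contributions toward a common value, which matches the interpolation described in the abstract.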

URL

https://arxiv.org/abs/1808.07503

PDF

https://arxiv.org/pdf/1808.07503.pdf

