Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation

2019-03-25 16:12:20
Jaime Spencer, Richard Bowden, Simon Hadfield

Abstract

How do computers and intelligent agents view the world around them? Feature extraction and representation constitute one of the basic building blocks towards answering this question. Traditionally, this has been done with carefully engineered hand-crafted techniques such as HOG, SIFT or ORB. However, there is no "one size fits all" approach that satisfies all requirements. In recent years, the rising popularity of deep learning has resulted in a myriad of end-to-end solutions to many computer vision problems. These approaches, while successful, tend to lack scalability and can't easily exploit information learned by other systems. Instead, we propose SAND features, a dedicated deep learning solution to feature extraction capable of providing hierarchical context information. This is achieved by employing sparse relative labels indicating relationships of similarity/dissimilarity between image locations. The nature of these labels results in an almost infinite set of dissimilar examples to choose from. We demonstrate how the selection of negative examples during training can be used to modify the feature space and vary its properties. To demonstrate the generality of this approach, we apply the proposed features to a multitude of tasks, each requiring different properties. These include disparity estimation, semantic segmentation, self-localisation and SLAM. In all cases, we show that incorporating SAND features yields results better than or comparable to the baseline, whilst requiring little to no additional training. Code can be found at: https://github.com/jspenmar/SAND_features
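
As a rough illustration of what training on relative similar/dissimilar labels between image locations can look like, the sketch below uses a classic margin-based contrastive loss on a pair of per-pixel feature vectors, plus a simple spatial rule for choosing a negative location. The abstract does not state the loss or the sampling rule, so both are assumptions made for illustration. The names `contrastive_loss`, `sample_negative`, `margin`, `min_dist` and `max_dist` are hypothetical and are not taken from the SAND_features repository.

```python
import math
import random


def contrastive_loss(feat_a, feat_b, similar, margin=1.0):
    """Margin-based contrastive loss for one pair of feature vectors.

    Assumption: a standard contrastive loss, not necessarily the paper's.
    Similar pairs are pulled together; dissimilar pairs are pushed apart
    until they are at least `margin` away from each other.
    """
    dist = math.dist(feat_a, feat_b)  # Euclidean distance in feature space
    if similar:
        return 0.5 * dist ** 2
    return 0.5 * max(0.0, margin - dist) ** 2


def sample_negative(pos, width, height, min_dist, max_dist, rng):
    """Pick a pixel whose image-space distance from `pos` is in [min_dist, max_dist].

    Hypothetical selection rule: narrowing or widening the range changes
    which dissimilar examples the features are trained against.
    Raises IndexError if no pixel falls inside the range.
    """
    candidates = [
        (x, y)
        for x in range(width)
        for y in range(height)
        if min_dist <= math.dist((x, y), pos) <= max_dist
    ]
    return rng.choice(candidates)


# Usage: one positive pair, one negative pair, and one sampled negative location.
rng = random.Random(0)
loss_pos = contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=True)
loss_neg = contrastive_loss([0.0, 0.0], [0.5, 0.0], similar=False)
neg_pixel = sample_negative((4, 4), 8, 8, 2.0, 3.0, rng)
```

In this sketch, a small `max_dist` draws negatives from the immediate neighbourhood of the anchor pixel, while a large `min_dist` draws them from far away. This is one simple way to picture how negative selection could shape the feature space.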

URL

https://arxiv.org/abs/1903.10427

PDF

https://arxiv.org/pdf/1903.10427.pdf

