Creativity Inspired Zero-Shot Learning

2019-04-01 21:05:23
Mohamed Elhoseiny, Mohamed Elfeki

Abstract

Zero-shot learning (ZSL) aims at understanding unseen categories with no training examples, using only class-level descriptions. To improve the discriminative power of zero-shot learning, we model the visual learning process of unseen categories, drawing inspiration from the psychology of human creativity for producing novel art. We relate ZSL to human creativity by observing that zero-shot learning is about recognizing the unseen and creativity is about creating a likable unseen. We introduce a learning signal inspired by the creativity literature that explores the unseen space with hallucinated class descriptions and encourages careful deviation of their visual feature generations from seen classes while allowing knowledge transfer from seen to unseen classes. Empirically, we show consistent improvement over the state of the art of several percentage points on the largest available benchmarks for the challenging task of generalized ZSL from noisy text, which is our focus, using the CUB and NABirds datasets. We also show the advantage of our approach on attribute-based ZSL on three additional datasets (AwA2, aPY, and SUN).
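
The learning signal described in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so everything here is an assumption made for illustration: a hallucinated class description is taken to be a convex blend of two seen-class text embeddings, and "deviation from seen classes" is measured as the KL divergence between a seen-class classifier's softmax output and the uniform distribution. The function names `hallucinate_description` and `deviation_loss` are hypothetical, and the paper's actual formulation may differ.

```python
import numpy as np


def hallucinate_description(t_a, t_b, alpha):
    """Blend two seen-class text embeddings into one hypothetical
    unseen-class description (assumed form: convex combination)."""
    return alpha * t_a + (1.0 - alpha) * t_b


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def deviation_loss(seen_class_logits):
    """KL(p || uniform) for the seen-class prediction p.

    The value is 0 when the classifier cannot assign the generated
    feature to any seen class (p is uniform) and grows as the feature
    looks more like one particular seen class. Minimizing it therefore
    pushes generations away from seen classes (an assumed stand-in
    for the paper's creativity-inspired signal).
    """
    p = softmax(np.asarray(seen_class_logits, dtype=float))
    k = p.shape[-1]
    safe_p = np.clip(p, 1e-12, None)  # avoid log(0)
    return float(np.sum(p * (np.log(safe_p) + np.log(k)), axis=-1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t_a, t_b = rng.normal(size=8), rng.normal(size=8)
    t_h = hallucinate_description(t_a, t_b, alpha=0.5)
    print("hallucinated embedding shape:", t_h.shape)
    print("loss, confused classifier:", deviation_loss(np.zeros(5)))
    print("loss, confident classifier:",
          deviation_loss(np.array([10.0, 0.0, 0.0, 0.0, 0.0])))
```

In a full model, a generator would map the hallucinated embedding to visual features, and this loss would be added to the usual generator objective so that knowledge still transfers from seen classes; that wiring is omitted here.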

URL

https://arxiv.org/abs/1904.01109

PDF

https://arxiv.org/pdf/1904.01109.pdf

