Paper Reading AI Learner

CIZSL++: Creativity Inspired Generative Zero-Shot Learning

2021-01-01 05:47:57
Mohamed Elhoseiny, Kai Yi, Mohamed Elfeki

Abstract

Zero-shot learning (ZSL) aims at understanding unseen categories with no training examples from class-level descriptions. To improve the discriminative power of ZSL, we model the visual learning process of unseen categories with inspiration from the psychology of human creativity for producing novel art. First, we propose CIZSL-v1 as a creativity inspired model for generative ZSL. We relate ZSL to human creativity by observing that ZSL is about recognizing the unseen, and creativity is about creating a likable unseen. We introduce a learning signal inspired by creativity literature that explores the unseen space with hallucinated class-descriptions and encourages careful deviation of their visual feature generations from seen classes while allowing knowledge transfer from seen to unseen classes. Second, CIZSL-v2 is proposed as an improved version of CIZSL-v1 for generative zero-shot learning. CIZSL-v2 consists of an investigation of additional inductive losses for unseen classes along with a semantic guided discriminator. Empirically, we show consistently that CIZSL losses can improve generative ZSL models on the challenging task of generalized ZSL from a noisy text on CUB and NABirds datasets. We also show the advantage of our approach to Attribute-based ZSL on AwA2, aPY, and SUN datasets. We also show that CIZSL-v2 has improved performance compared to CIZSL-v1.

Abstract (translated)

URL

https://arxiv.org/abs/2101.00173

PDF

https://arxiv.org/pdf/2101.00173.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot