Disambiguation of One-Shot Visual Classification Tasks: A Simplex-Based Approach

2023-01-16 11:37:05
Yassir Bendou, Lucas Drumetz, Vincent Gripon, Giulia Lioi, Bastien Pasdeloup

Abstract

The field of visual few-shot classification aims at transferring the state-of-the-art performance of deep learning visual systems onto tasks where only a very limited number of training samples are available. The main solution consists of training a feature extractor on a large and diverse dataset and then applying it to the considered few-shot task. Thanks to the priors encoded in the feature extractor, classification tasks with as few as one example (or "shot") per class can be solved with high accuracy, even when the shots display individual features that are not representative of their classes. Yet the problem becomes more complicated when some of the given shots display multiple objects. In this paper, we present a strategy that aims at detecting the presence of multiple, previously unseen objects in a given shot. This methodology is based on identifying the corners of a simplex in a high-dimensional space. We introduce an optimization routine and showcase its ability to successfully detect multiple (previously unseen) objects in raw images. Then, we introduce a downstream classifier meant to exploit the presence of multiple objects to improve the performance of few-shot classification in the extreme setting where only one shot is given per class. Using standard benchmarks of the field, we show that the proposed method improves accuracy in these settings by a small but statistically significant margin.
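
To make the one-shot setting concrete, here is a minimal sketch of the baseline the abstract alludes to: each class is represented by the feature vector of its single shot, and a query is assigned to the most similar one. This is not the paper's simplex-based method. The cosine-similarity rule, the function names, and the toy random vectors standing in for a pretrained feature extractor's output are all assumptions made for illustration.

```python
import math
import random


def l2_normalize(v):
    """Scale a feature vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    u, v = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u, v))


def one_shot_classify(shots, query):
    """Return the label whose single shot is most similar to the query.

    shots: dict mapping class label -> feature vector of its one shot.
    query: feature vector of the image to classify.
    """
    return max(shots, key=lambda label: cosine(shots[label], query))


def make_toy_task(seed=0, dim=16, noise=0.1):
    """Hypothetical stand-in for features from a pretrained extractor."""
    rng = random.Random(seed)
    # One random "class center" per label (assumed, not real data).
    centers = {c: [rng.gauss(0, 1) for _ in range(dim)]
               for c in ("cat", "dog", "bird")}
    # One noisy shot and one noisy query per class.
    shots = {c: [x + rng.gauss(0, noise) for x in mu]
             for c, mu in centers.items()}
    queries = [(c, [x + rng.gauss(0, noise) for x in mu])
               for c, mu in centers.items()]
    return shots, queries


shots, queries = make_toy_task()
predictions = [one_shot_classify(shots, q) for _, q in queries]
```

With a single shot per class, a shot whose features mix several objects shifts that class's only reference point, which is the kind of ambiguity the abstract says the paper sets out to detect.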

URL

https://arxiv.org/abs/2301.06372

PDF

https://arxiv.org/pdf/2301.06372.pdf

