Paper Reading AI Learner

Active Scene Learning

2019-03-07 11:07:54
Erelcan Yanik, Tevfik Metin Sezgin

Abstract

Sketch recognition allows natural and efficient interaction in pen-based interfaces. A key obstacle to building accurate sketch recognizers has been the difficulty of creating large amounts of annotated training data. Several authors have attempted to address this issue by creating synthetic data and by building tools that support efficient annotation. Two prominent sets of approaches stand out. Both use interim classifiers, trained with a small set of labeled data, to aid the labeling of the remainder of the data. The first set of approaches uses a classifier trained on a partially labeled dataset to automatically label unlabeled instances. The second, based on active learning, saves annotation effort by giving priority to labeling informative data instances. The former is sub-optimal since it does not prioritize the order of labeling to favor informative instances, while the latter makes the strong assumption that unlabeled data comes in an already segmented form (i.e., the ink in the training data is already assembled into groups forming isolated object instances). In this paper, we propose an active learning framework that combines the strengths of these methods while addressing their weaknesses. In particular, we propose two methods for deciding how batches of unsegmented sketch scenes should be labeled. The first method, scene-wise selection, assesses the informativeness of each drawing (sketch scene) as a whole and asks the user to annotate all objects in the drawing. The second, segment-wise selection, targets informative fragments of drawings more precisely for user labeling. We show that both selection schemes outperform random selection. Furthermore, we demonstrate that precise targeting yields superior performance. Overall, our approach reaches top accuracy figures with up to 30% savings in annotation cost.
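The abstract contrasts two batch-selection schemes: ranking whole scenes by informativeness versus ranking individual segments. A minimal sketch of that contrast, assuming entropy of an interim classifier's predicted class distribution as the informativeness measure (the paper's actual scoring function and helper names `scene_wise_selection` / `segment_wise_selection` are illustrative assumptions, not taken from the paper):

```python
import math

def entropy(probs):
    # Shannon entropy of a predicted class distribution;
    # higher entropy = less confident = more informative to label.
    return -sum(p * math.log(p) for p in probs if p > 0)

def scene_wise_selection(scenes, k):
    # Rank whole scenes by mean segment uncertainty; the annotator then
    # labels every object in each selected scene.
    ranked = sorted(scenes.items(),
                    key=lambda kv: -sum(entropy(p) for p in kv[1]) / len(kv[1]))
    return [name for name, _ in ranked[:k]]

def segment_wise_selection(scenes, k):
    # Rank individual segments across all scenes; the annotator labels
    # only the most informative fragments.
    segs = [(entropy(p), name, i)
            for name, probs in scenes.items()
            for i, p in enumerate(probs)]
    segs.sort(reverse=True)
    return [(name, i) for _, name, i in segs[:k]]

# Toy interim-classifier outputs: per-scene lists of class probabilities,
# one distribution per candidate segment.
scenes = {
    "scene_a": [[0.9, 0.1], [0.5, 0.5]],    # one confident, one uncertain segment
    "scene_b": [[0.95, 0.05], [0.9, 0.1]],  # mostly confident
}
print(scene_wise_selection(scenes, 1))
print(segment_wise_selection(scenes, 2))
```

On this toy data, scene-wise selection picks `scene_a` (its uncertain second segment raises the scene's mean entropy), while segment-wise selection singles out that one ambiguous segment directly, illustrating why finer targeting can spend annotation budget more efficiently.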


URL

https://arxiv.org/abs/1903.02832

PDF

https://arxiv.org/pdf/1903.02832.pdf
