Paper Reading AI Learner

Context-Aware Embeddings for Automatic Art Analysis

2019-04-10 02:37:49
Noa Garcia, Benjamin Renoust, Yuta Nakashima

Abstract

Automatic art analysis aims to classify and retrieve artistic representations from a collection of images by using computer vision and machine learning techniques. In this work, we propose to enhance visual representations from neural networks with contextual artistic information. Whereas visual representations are able to capture information about the content and the style of an artwork, our proposed context-aware embeddings additionally encode relationships between different artistic attributes, such as author, school, or historical period. We design two different approaches for using context in automatic art analysis. In the first one, contextual data is obtained through a multi-task learning model, in which several attributes are trained together to find visual relationships between elements. In the second approach, context is obtained through an art-specific knowledge graph, which encodes relationships between artistic attributes. An exhaustive evaluation of both of our models on several art analysis problems, such as author identification, type classification, and cross-modal retrieval, shows that performance is improved by up to 7.3% in art classification and 37.24% in retrieval when context-aware embeddings are used.
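The first approach described in the abstract is a multi-task setup: one shared visual embedding feeds several attribute-specific classification heads (author, school, period, type), and training them jointly pushes cross-attribute relationships into the shared representation. A minimal numpy sketch of that head structure follows; the dimensions, task names, and class counts are illustrative assumptions, not the paper's actual architecture, and the shared embedding stands in for a real CNN backbone output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: class counts per artistic attribute are made up
# for illustration; the paper does not specify them here.
feat_dim = 128
tasks = {"author": 10, "school": 5, "period": 4, "type": 6}

# Stand-in for the backbone's visual embedding of one painting.
shared_embedding = rng.normal(size=(1, feat_dim))

# One linear classification head per attribute. Jointly training these
# heads is what makes the shared embedding "context-aware".
heads = {name: rng.normal(size=(feat_dim, n)) for name, n in tasks.items()}

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Forward pass: every task reads the same shared embedding.
predictions = {name: softmax(shared_embedding @ W) for name, W in heads.items()}

for name, p in predictions.items():
    print(name, p.shape)
```

In a real implementation the per-task cross-entropy losses would be summed and backpropagated through the shared backbone, which is what ties the attributes together during training.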

URL

https://arxiv.org/abs/1904.04985

PDF

https://arxiv.org/pdf/1904.04985.pdf

