FolkTalent: Enhancing Classification and Tagging of Indian Folk Paintings

2024-05-14 17:11:33
Nancy Hada, Aditya Singh, Kavita Vemuri

Abstract

Indian folk paintings contain a rich mosaic of symbols, colors, textures, and stories, making them an invaluable repository of cultural legacy. The paper presents a novel approach to classifying these paintings into distinct art forms and tagging them with their salient features. A custom dataset named FolkTalent, comprising 2279 digital images of paintings across 12 different forms, has been prepared from websites that are direct outlets of Indian folk paintings. Tags covering a wide range of attributes such as color, theme, artistic style, and patterns are generated for each painting using GPT4 and verified by an expert. Classification is performed by applying the Random Forest ensemble technique to fine-tuned Convolutional Neural Network (CNN) models, achieving an accuracy of 91.83%. Tagging is accomplished with prominent fine-tuned CNN-based backbones, each with a custom classifier attached on top to perform multi-label image classification. The generated tags offer deeper insight into each painting, enabling an enhanced search experience based on theme and visual attributes. The proposed hybrid model sets a new benchmark in folk painting classification and tagging, contributing significantly to cataloging India's folk-art heritage.
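
As a rough illustration of the multi-label tagging and tag-based search described in the abstract, the sketch below turns per-tag scores into a set of tags and then filters a small catalog by tag. It is not the authors' implementation: the tag vocabulary, the logit values, the 0.5 threshold, and the names `logits_to_tags` and `search` are all assumptions made for this example, and the logits merely stand in for the output of a CNN backbone with a classifier head.

```python
import math

# Hypothetical tag vocabulary; the paper's actual tag set is not listed in the abstract.
TAGS = ["red", "blue", "mythology", "nature", "geometric_pattern"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logits_to_tags(logits, tags=TAGS, threshold=0.5):
    """Multi-label decoding: keep every tag whose sigmoid probability meets the threshold.

    Unlike softmax classification, each tag is decided independently,
    so one painting can receive several tags (or none).
    """
    if len(logits) != len(tags):
        raise ValueError("expected one logit per tag")
    return {tag for tag, z in zip(tags, logits) if sigmoid(z) >= threshold}


def search(catalog, required_tags):
    """Return the sorted names of paintings that carry all of the required tags."""
    required = set(required_tags)
    return sorted(name for name, tags in catalog.items() if required <= tags)


# Made-up logits standing in for the output of a CNN backbone plus classifier head.
catalog = {
    "painting_a": logits_to_tags([2.1, -1.3, 0.8, -0.2, 1.5]),
    "painting_b": logits_to_tags([-0.7, 1.9, -2.0, 1.1, 0.3]),
}

if __name__ == "__main__":
    print(search(catalog, ["geometric_pattern"]))
    print(search(catalog, ["red", "mythology"]))
```

Deciding each tag with its own sigmoid, rather than picking a single best class, is what lets one painting carry color, theme, and pattern tags at the same time; the threshold is a tunable assumption rather than a value taken from the paper.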

URL

https://arxiv.org/abs/2405.08776

PDF

https://arxiv.org/pdf/2405.08776.pdf

