PaCaNet: A Study on CycleGAN with Transfer Learning for Diversifying Fused Chinese Painting and Calligraphy

2023-01-30 17:22:10
Zuhao Yang, Huajun Bai, Zhang Luo, Yang Xu, Wei Pang, Yue Wang, Yisheng Yuan, Yingfang Yuan

Abstract

AI-Generated Content (AIGC) has recently surged in popularity, driven by its efficiency and consistency in production and by the ease with which its output can be customized and diversified. The cross-modal nature of the representation learning mechanism in most AIGC technology allows more freedom and flexibility in exploring new types of art that would have been impossible in the past. Inspired by the hieroglyphic subset of Chinese characters, we proposed PaCaNet, a CycleGAN-based pipeline for producing novel artworks that fuse two different art types: traditional Chinese painting and calligraphy. To produce stable and diversified output, we adopted three main technical innovations: (1) using one-shot learning to increase the creativity of pre-trained models and diversify the content of the fused images; (2) controlling the preference over generated Chinese calligraphy by freezing randomly sampled parameters in pre-trained models; and (3) using a regularization method to encourage the models to produce images similar to Chinese paintings. Furthermore, we conducted a systematic study of PaCaNet's performance in diversifying fused Chinese painting and calligraphy, which showed satisfactory results. In conclusion, we provide a new direction for creating cross-modal art by fusing the visual information in paintings with the linguistic features in Chinese calligraphy. Our approach creates a unique aesthetic experience rooted in the origins of Chinese hieroglyphic characters. It is also an opportunity to delve deeper into traditional artwork and, in doing so, to help preserve and revitalize traditional heritage.
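
The second innovation, freezing randomly sampled parameters of a pre-trained model, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say whether sampling happens per layer or per weight, so the sketch assumes whole named parameters are sampled, uses scalars in place of weight tensors, and introduces the hypothetical helpers `sample_frozen` and `sgd_step` plus an assumed `freeze_fraction` setting.

```python
import random


def sample_frozen(param_names, freeze_fraction, seed=0):
    """Pick a random subset of parameter names to freeze (hypothetical helper)."""
    if not 0.0 <= freeze_fraction <= 1.0:
        raise ValueError("freeze_fraction must be in [0, 1]")
    rng = random.Random(seed)
    k = round(freeze_fraction * len(param_names))
    # Sort first so the sample depends only on the seed, not on input order.
    return set(rng.sample(sorted(param_names), k))


def sgd_step(params, grads, frozen, lr=0.1):
    """Apply one SGD step; frozen parameters keep their pre-trained values."""
    return {
        name: value if name in frozen else value - lr * grads[name]
        for name, value in params.items()
    }


# Toy "pre-trained" parameters (scalars standing in for weight tensors).
params = {"conv1": 1.0, "conv2": 2.0, "res1": 3.0, "res2": 4.0}
grads = {name: 1.0 for name in params}

frozen = sample_frozen(list(params), freeze_fraction=0.5, seed=42)
updated = sgd_step(params, grads, frozen)
```

In a deep learning framework, the same idea is usually expressed by excluding the sampled parameters from gradient updates; the fraction frozen is the knob that trades off how much of the pre-trained behaviour is preserved during fine-tuning.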

URL

https://arxiv.org/abs/2301.13082

PDF

https://arxiv.org/pdf/2301.13082.pdf

