Emoji Prediction: Extensions and Benchmarking

2020-07-14 22:41:20
Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi

Abstract

Emojis are a succinct form of language that can express concrete meanings, emotions, and intentions. They also carry signals that can be used to better understand communicative intent. Emojis have become a ubiquitous part of daily life, which makes them important for understanding user-generated content. The emoji prediction task aims to predict the proper set of emojis associated with a piece of text. Through emoji prediction, models can learn rich representations of the communicative intent of written text. Existing research on the emoji prediction task focuses on a small subset of emoji types closely related to certain emotions; this setting oversimplifies the task and wastes the expressive power of emojis. In this paper, we extend the existing setting of the emoji prediction task to include a richer set of emojis and to allow multi-label classification. We propose novel models for multi-class and multi-label emoji prediction based on Transformer networks. We also construct multiple emoji prediction datasets from Twitter using heuristics. The BERT models achieve state-of-the-art performance on all our datasets under all settings, with relative improvements of 27.21% to 236.36% in accuracy, 2.01% to 88.28% in top-5 accuracy, and 65.19% to 346.79% in F-1 score compared to the prior state of the art. Our results demonstrate the efficacy of deep Transformer-based models on the emoji prediction task. We also release our datasets for future researchers.
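The abstract reports accuracy, top-5 accuracy, and F-1 score for the multi-class and multi-label settings, along with relative improvements over prior work. The following is a minimal sketch of how such metrics are commonly computed, not the authors' evaluation code. The function names, the 0.5 decision threshold, and the choice of micro-averaged F-1 are assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Sequence, Set


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int = 5) -> float:
    """Multi-class setting: fraction of examples whose single gold emoji
    is among the k highest-scoring classes."""
    hits = 0
    for row, gold in zip(scores, labels):
        # Class indices sorted by descending score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if gold in ranked[:k]:
            hits += 1
    return hits / len(labels)


def multi_label_predict(scores: Sequence[Sequence[float]],
                        threshold: float = 0.5) -> List[Set[int]]:
    """Multi-label setting: keep every class whose score reaches the
    threshold (assumed value, not taken from the paper)."""
    return [{i for i, s in enumerate(row) if s >= threshold} for row in scores]


def micro_f1(pred: Sequence[Set[int]], gold: Sequence[Set[int]]) -> float:
    """Micro-averaged F-1 over label sets (averaging scheme is an assumption)."""
    tp = sum(len(p & g) for p, g in zip(pred, gold))
    fp = sum(len(p - g) for p, g in zip(pred, gold))
    fn = sum(len(g - p) for p, g in zip(pred, gold))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def relative_improvement(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0
```

Under this definition, a relative improvement of 236.36% means the new score is roughly 3.36 times the baseline score, not 2.36 points higher.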

URL

https://arxiv.org/abs/2007.07389

PDF

https://arxiv.org/pdf/2007.07389.pdf

