Exploiting Vietnamese Social Media Characteristics for Textual Emotion Recognition in Vietnamese

2020-09-23 08:49:39

Khang Phuoc-Quy Nguyen, Kiet Van Nguyen

arXiv_CL

arXiv_CL Recognition Emotion

Abstract

Textual emotion recognition has been a promising research topic in recent years, with many researchers trying to build automated systems that correctly detect human emotion from text data. In this paper, we conduct several experiments to show how data pre-processing affects a machine learning method for textual emotion recognition. The experiments are performed on the benchmark dataset Vietnamese Social Media Emotion Corpus (UIT-VSMEC). We exploit characteristics of Vietnamese social media text to clean the data, and then extract key phrases that are likely to carry emotional context. Our experimental evaluation shows that, with appropriate pre-processing techniques, Multinomial Logistic Regression (MLR) achieves the best F1-score of 64.40%, a significant improvement of 4.66 percentage points over the CNN model built by the authors of UIT-VSMEC (59.74%).
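
The abstract does not list the cleaning rules themselves, so the following is only a minimal sketch of what social-media-aware cleaning of this kind might look like. The function name `clean_comment`, the abbreviation map, the emoticon map, and the placeholder tokens `emo_pos` and `emo_neg` are all illustrative assumptions, not the authors' actual pipeline.

```python
import re

# Hypothetical abbreviation map (common Vietnamese chat shorthand);
# the paper's real rules are not given in the abstract.
ABBREVIATIONS = {"ko": "không", "dc": "được", "bt": "bình thường"}

# Hypothetical mapping from emoticons to placeholder tokens.
EMOTICONS = {":)": "emo_pos", ":(": "emo_neg"}


def clean_comment(text: str) -> str:
    """Normalize one social media comment (illustrative rules only)."""
    text = text.lower()
    # Replace emoticons with placeholder tokens so their signal is kept.
    for emoticon, token in EMOTICONS.items():
        text = text.replace(emoticon, f" {token} ")
    # Collapse any character repeated three or more times ("vuiiii" -> "vui").
    text = re.sub(r"(.)\1{2,}", r"\1", text)
    # Expand abbreviations token by token; unknown tokens pass through.
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


if __name__ == "__main__":
    print(clean_comment("Hôm nay vuiiii quá :)"))
```

Cleaned text of this kind could then be vectorized and passed to a classifier such as MLR; the choice of features and hyperparameters is outside what the abstract states.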

URL

https://arxiv.org/abs/2009.11005

PDF

https://arxiv.org/pdf/2009.11005.pdf