Emotion Recognition based on Third-Order Circular Suprasegmental Hidden Markov Model

2019-03-23 11:24:08
Ismail Shahin

Abstract

This work focuses on recognizing an unknown emotion using the Third-Order Circular Suprasegmental Hidden Markov Model (CSPHMM3) as a classifier. The approach was tested on the Emotional Prosody Speech and Transcripts (EPST) database, with Mel-Frequency Cepstral Coefficients (MFCCs) as the extracted features. CSPHMM3 yields an average emotion recognition accuracy of 77.8%. The results show that CSPHMM3 outperforms the Third-Order Hidden Markov Model (HMM3), Gaussian Mixture Model (GMM), Support Vector Machine (SVM), and Vector Quantization (VQ) by 6.0%, 4.9%, 3.5%, and 5.4%, respectively, for emotion recognition. The average accuracy achieved with CSPHMM3 is comparable to that obtained through subjective assessment by human judges.
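
The abstract reports the CSPHMM3 accuracy and its margins over four baselines, but it does not say whether those margins are percentage points or relative gains. The sketch below works out the baseline accuracies implied under each reading. Only the 77.8% figure and the four margins come from the abstract; the function name `implied_baselines` and both interpretations are assumptions made for illustration, and the paper itself should be consulted for the actual baseline figures.

```python
# Figures taken from the abstract.
CSPHMM3_ACCURACY = 77.8  # average emotion recognition accuracy, in percent
MARGINS = {"HMM3": 6.0, "GMM": 4.9, "SVM": 3.5, "VQ": 5.4}  # reported margins, in percent


def implied_baselines(accuracy, margins):
    """Return {model: (absolute_reading, relative_reading)} in percent.

    Hypothetical helper: the abstract does not state which reading applies.
    """
    result = {}
    for model, margin in margins.items():
        # Reading 1 (assumption): the margin is a difference in percentage points.
        absolute = accuracy - margin
        # Reading 2 (assumption): the margin is a relative improvement over the baseline.
        relative = accuracy / (1 + margin / 100)
        result[model] = (round(absolute, 1), round(relative, 1))
    return result


if __name__ == "__main__":
    for model, (abs_acc, rel_acc) in implied_baselines(CSPHMM3_ACCURACY, MARGINS).items():
        print(f"{model}: {abs_acc}% (percentage points) or {rel_acc}% (relative gain)")
```

The two readings differ by up to about 1.6 points (for HMM3), so the choice matters when comparing these baselines against figures from other papers.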

URL

https://arxiv.org/abs/1903.09803

PDF

https://arxiv.org/pdf/1903.09803.pdf
