Clinical Depression and Affect Recognition with EmoAudioNet

2019-11-01 11:38:58

Emna Rejaibi, Daoud Kadoch, Kamil Bentounes, Romain Alfred, Mohamed Daoudi, Abdenour Hadid, Alice Othmani

arXiv_SD

arXiv_SD Recognition Pose Action Emotion Speech

Abstract

Automatic analysis of emotions and affects from speech is an inherently challenging problem with a broad range of applications in Human-Computer Interaction (HCI), health informatics, assistive technologies, and multimedia retrieval. Understanding humans' specific and basic emotions and reacting accordingly can improve HCI. In addition, giving machines the ability to understand human emotions when interacting with other humans can help humans with socio-affective intelligence. In this paper, we present a deep neural network-based architecture called EmoAudioNet, which studies the time-frequency representation of the audio signal and the visual representation of its frequency spectrum. EmoAudioNet is applied to two tasks: automatic clinical depression recognition and continuous dimensional emotion recognition from speech. Extensive experiments show that the proposed approach significantly outperforms state-of-the-art approaches on the RECOLA and DAIC-WOZ databases. These competitive results motivate applying EmoAudioNet to other affect and emotion recognition tasks from speech.
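
To make the phrase "time-frequency representation of the audio signal" concrete, here is a minimal sketch of a magnitude spectrogram computed with a short-time Fourier transform. It is an illustration only, not the paper's pipeline: the abstract does not specify EmoAudioNet's features or parameters, so the function name `spectrogram`, the frame length, the hop size, the sample rate, and the synthetic test tone are all assumptions.

```python
import cmath
import math


def spectrogram(signal, frame_len=128, hop=64):
    """Return a magnitude spectrogram: a list of frames (time), each a list of bins (frequency)."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT, O(frame_len^2): fine for a demo; use an FFT library in practice.
        mags = []
        for k in range(n_bins):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical input: a pure tone standing in for a speech recording.
SAMPLE_RATE = 4000  # Hz (assumed)
TONE_HZ = 500       # Hz (assumed)
signal = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(1000)]

spec = spectrogram(signal)
# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / 128
```

Each row of `spec` describes the frequency content of one short time window, so the full matrix can be rendered as an image, which is one plausible reading of the "visual representation" of the spectrum mentioned above.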

URL

https://arxiv.org/abs/1911.00310

PDF

https://arxiv.org/pdf/1911.00310.pdf