Speech Emotion Recognition Method Based on Support Vector Machine and Suprasegmental Acoustic Features
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | Russian |
| Published: | Educational institution «Belarusian State University of Informatics and Radioelectronics», 2024-06-01 |
| Series: | Doklady Belorusskogo gosudarstvennogo universiteta informatiki i radioèlektroniki |
| Subjects: | |
| Online Access: | https://doklady.bsuir.by/jour/article/view/3938 |
| Summary: | The paper studies emotion recognition in speech signals from mel-frequency cepstral coefficients (MFCCs) with a support vector machine (SVM) classifier. The RAVDESS data set was used in the experiments. The proposed model feeds a 306-component suprasegmental feature vector to the SVM classifier. Model quality was assessed using unweighted average recall (UAR). Linear, polynomial, and radial basis function kernels were considered for the classifier. Signal analysis frame sizes from 23 to 341 ms were investigated at the MFCC extraction stage. The resulting model reached UAR = 48 %. The proposed approach shows potential for applications such as voice assistants, virtual agents, and mental health diagnostics. |
| ISSN: | 1729-7648 |
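
The summary reports model quality as unweighted average recall (UAR), which is the mean of the per-class recalls, so each emotion class counts equally regardless of how many utterances it has. The following is a minimal sketch of that metric in plain Python. The function name and the toy labels are hypothetical illustrations and are not taken from the article or from RAVDESS.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    # Recall of a class = correct predictions / true samples of that class.
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels, chosen only to show how UAR differs from accuracy.
y_true = ["calm", "calm", "calm", "calm", "angry", "angry"]
y_pred = ["calm", "calm", "calm", "angry", "angry", "calm"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the recalls are 3/4 for "calm" and 1/2 for "angry", so UAR is 0.625 while plain accuracy is 4/6. The gap shows why UAR is often preferred when class sizes are unequal.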