Exploration of Complementary Features for Speech Emotion Recognition Based on Kernel Extreme Learning Machine
Previous studies of speech emotion recognition have used either empirical features (e.g., F0, energy, and voice probability) or spectrogram-based statistical features. The empirical features can highlight the human knowledge of emotion recognition, while the statistical features enable a general represe...
| Main Authors: | Lili Guo, Longbiao Wang, Jianwu Dang, Zhilei Liu, Haotian Guan |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2019-01-01 |
| Series: | IEEE Access |
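| Subjects: | Speech emotion recognition; auditory-based features; spectrogram-based features; complementary features; kernel extreme learning machine |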
| Online Access: | https://ieeexplore.ieee.org/document/8732399/ |
| _version_ | 1849405842378457088 |
|---|---|
| author | Lili Guo Longbiao Wang Jianwu Dang Zhilei Liu Haotian Guan |
| author_sort | Lili Guo |
| collection | DOAJ |
| description | Previous studies of speech emotion recognition have used either empirical features (e.g., F0, energy, and voice probability) or spectrogram-based statistical features. The empirical features can highlight the human knowledge of emotion recognition, while the statistical features enable a general representation but do not emphasize human knowledge sufficiently. However, using these two kinds of features together can supply complementary cues that humans may use unconsciously in daily life but that have not yet been recognized. Based on this consideration, this paper proposes a dynamic fusion framework to utilize the potential advantages of the complementary spectrogram-based statistical features and the auditory-based empirical features. In addition, a kernel extreme learning machine (KELM) is adopted as the classifier to distinguish emotions. To validate the proposed framework, we conduct experiments on two public emotional databases, Emo-DB and IEMOCAP. The experimental results demonstrate that the proposed fusion framework significantly outperforms the existing state-of-the-art methods. The results also show that the proposed method, by integrating the auditory-based features with spectrogram-based features, could achieve a notably improved performance over the conventional methods. |
| format | Article |
| id | doaj-art-b5b74ad333e849c98bc3ff3659bd33c8 |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2019-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-b5b74ad333e849c98bc3ff3659bd33c8; 2025-08-20T03:36:34Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; volume 7, pages 75798-75809; 10.1109/ACCESS.2019.2921390; 8732399; Exploration of Complementary Features for Speech Emotion Recognition Based on Kernel Extreme Learning Machine; Lili Guo (Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China); Longbiao Wang (https://orcid.org/0000-0002-4005-5036; Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China); Jianwu Dang (Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China); Zhilei Liu (https://orcid.org/0000-0003-1447-6256; Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China); Haotian Guan (Huiyan Technology (Tianjin) Co., Ltd., Tianjin, China); https://ieeexplore.ieee.org/document/8732399/; Speech emotion recognition; auditory-based features; spectrogram-based features; complementary features; kernel extreme learning machine |
| title | Exploration of Complementary Features for Speech Emotion Recognition Based on Kernel Extreme Learning Machine |
| topic | Speech emotion recognition auditory-based features spectrogram-based features complementary features kernel extreme learning machine |
| url | https://ieeexplore.ieee.org/document/8732399/ |