Exploration of Complementary Features for Speech Emotion Recognition Based on Kernel Extreme Learning Machine

Previous studies of speech emotion recognition have used either empirical features (e.g., F0, energy, and voice probability) or spectrogram-based statistical features. The empirical features can highlight the human knowledge of emotion recognition, while the statistical features enable a general represe...

Bibliographic Details
Main Authors: Lili Guo, Longbiao Wang, Jianwu Dang, Zhilei Liu, Haotian Guan
Format: Article
Language: English
Published: IEEE 2019-01-01
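Series: IEEE Access
Subjects: Speech emotion recognition; auditory-based features; spectrogram-based features; complementary features; kernel extreme learning machine
Online Access: https://ieeexplore.ieee.org/document/8732399/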
author Lili Guo
Longbiao Wang
Jianwu Dang
Zhilei Liu
Haotian Guan
collection DOAJ
description Previous studies of speech emotion recognition have used either empirical features (e.g., F0, energy, and voice probability) or spectrogram-based statistical features. The empirical features highlight human knowledge of emotion recognition, while the statistical features enable a general representation but do not emphasize human knowledge sufficiently. Used together, however, the two kinds of features can complement each other and capture cues that humans may use unconsciously in daily life but that have not yet been identified. Based on this consideration, this paper proposes a dynamic fusion framework to exploit the potential advantages of the complementary spectrogram-based statistical features and auditory-based empirical features. In addition, a kernel extreme learning machine (KELM) is adopted as the classifier to distinguish emotions. To validate the proposed framework, we conduct experiments on two public emotional databases, Emo-DB and IEMOCAP. The experimental results demonstrate that the proposed fusion framework significantly outperforms the existing state-of-the-art methods. The results also show that the proposed method, by integrating the auditory-based features with spectrogram-based features, achieves notably improved performance over the conventional methods.
format Article
id doaj-art-b5b74ad333e849c98bc3ff3659bd33c8
institution Kabale University
issn 2169-3536
language English
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-b5b74ad333e849c98bc3ff3659bd33c8
2025-08-20T03:36:34Z
eng
IEEE
IEEE Access
2169-3536
2019-01-01
Volume 7, pages 75798-75809
10.1109/ACCESS.2019.2921390
8732399
Exploration of Complementary Features for Speech Emotion Recognition Based on Kernel Extreme Learning Machine
Lili Guo (0)
Longbiao Wang (1), https://orcid.org/0000-0002-4005-5036
Jianwu Dang (2)
Zhilei Liu (3), https://orcid.org/0000-0003-1447-6256
Haotian Guan (4)
(0) Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China
(1) Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China
(2) Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China
(3) Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China
(4) Huiyan Technology (Tianjin) Co., Ltd., Tianjin, China
Previous studies of speech emotion recognition have used either empirical features (e.g., F0, energy, and voice probability) or spectrogram-based statistical features. The empirical features highlight human knowledge of emotion recognition, while the statistical features enable a general representation but do not emphasize human knowledge sufficiently. Used together, however, the two kinds of features can complement each other and capture cues that humans may use unconsciously in daily life but that have not yet been identified. Based on this consideration, this paper proposes a dynamic fusion framework to exploit the potential advantages of the complementary spectrogram-based statistical features and auditory-based empirical features. In addition, a kernel extreme learning machine (KELM) is adopted as the classifier to distinguish emotions. To validate the proposed framework, we conduct experiments on two public emotional databases, Emo-DB and IEMOCAP. The experimental results demonstrate that the proposed fusion framework significantly outperforms the existing state-of-the-art methods. The results also show that the proposed method, by integrating the auditory-based features with spectrogram-based features, achieves notably improved performance over the conventional methods.
https://ieeexplore.ieee.org/document/8732399/
Speech emotion recognition
auditory-based features
spectrogram-based features
complementary features
kernel extreme learning machine
title Exploration of Complementary Features for Speech Emotion Recognition Based on Kernel Extreme Learning Machine
topic Speech emotion recognition
auditory-based features
spectrogram-based features
complementary features
kernel extreme learning machine
url https://ieeexplore.ieee.org/document/8732399/
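
The description in this record names a kernel extreme learning machine (KELM) as the classifier but gives no formulation. As a rough illustration only, the sketch below uses the commonly cited KELM closed form, in which the output weights solve (K + I/C) beta = T and a new sample is scored as k(x) beta. It is not the authors' implementation: the RBF kernel, the values of C and gamma, the one-hot target encoding, the function names (rbf_kernel, solve, kelm_fit, kelm_predict), and the toy data are all assumptions made for this example, and the input vectors merely stand in for whatever fused feature vectors the paper produces.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel value for two equal-length feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix [A | B]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def kelm_fit(X, labels, n_classes, C=10.0, gamma=1.0):
    """Return output weights beta solving (K + I/C) beta = T.

    C (regularization) and gamma (kernel width) are assumed values,
    not settings taken from the paper.
    """
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    for i in range(n):
        K[i][i] += 1.0 / C  # ridge term keeps the system well conditioned
    # One-hot targets: row i has a 1.0 in the column of its class label.
    T = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
         for i in range(n)]
    return solve(K, T)


def kelm_predict(x, X_train, beta, gamma=1.0):
    """Score x against every training sample and return the best class index."""
    k = [rbf_kernel(x, xi, gamma) for xi in X_train]
    n_classes = len(beta[0])
    scores = [sum(k[i] * beta[i][c] for i in range(len(k)))
              for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: scores[c])


# Toy, made-up 2-D "feature vectors" for two classes (hypothetical data).
X_toy = [[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]]
y_toy = [0, 0, 1, 1]
beta_toy = kelm_fit(X_toy, y_toy, n_classes=2)
print(kelm_predict([0.1, 0.1], X_toy, beta_toy))
```

Training reduces to a single linear solve with no iterative optimization. The hand-written solver is only there to keep the sketch dependency-free; a numerical library would normally replace it.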