Exploration of Complementary Features for Speech Emotion Recognition Based on Kernel Extreme Learning Machine


Bibliographic Details
Main Authors: Lili Guo, Longbiao Wang, Jianwu Dang, Zhilei Liu, Haotian Guan
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/8732399/
Description
Summary: Previous studies of speech emotion recognition used either empirical features (e.g., F0, energy, and voice probability) or spectrogram-based statistical features. The empirical features encode human knowledge of emotion recognition, while the statistical features provide a general representation but do not emphasize human knowledge sufficiently. Using these two kinds of features together, however, can capture complementary cues that humans may use unconsciously in daily life but that have not yet been explicitly characterized. Based on this consideration, this paper proposes a dynamic fusion framework that exploits the potential advantages of the complementary spectrogram-based statistical features and the auditory-based empirical features. In addition, a kernel extreme learning machine (KELM) is adopted as the classifier to distinguish emotions. To validate the proposed framework, we conduct experiments on two public emotional databases, Emo-DB and IEMOCAP. The experimental results demonstrate that the proposed fusion framework significantly outperforms existing state-of-the-art methods, and that integrating the auditory-based features with the spectrogram-based features yields notably improved performance over conventional methods.
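For readers unfamiliar with the classifier named in the abstract, the following is a minimal sketch of a kernel extreme learning machine: a multi-class classifier solved in closed form as regularized least squares in kernel space. This is an illustrative implementation, not the authors' exact configuration; the RBF kernel and the `C` and `gamma` values are assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    # Pairwise squared Euclidean distances, then the Gaussian (RBF) kernel.
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * d2)

class KELM:
    """Kernel extreme learning machine (sketch).

    Training solves (K + I/C) beta = T in closed form, where K is the
    kernel matrix over training samples and T holds one-hot class targets.
    """
    def __init__(self, C=10.0, gamma=0.1):
        self.C = C          # regularization strength (assumed value)
        self.gamma = gamma  # RBF kernel width (assumed value)

    def fit(self, X, y):
        self.X = X
        n = X.shape[0]
        # One-hot targets for multi-class emotion labels.
        self.classes_ = np.unique(y)
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        K = rbf_kernel(X, X, self.gamma)
        # Regularized least-squares output weights: beta = (K + I/C)^-1 T
        self.beta = np.linalg.solve(K + np.eye(n) / self.C, T)
        return self

    def predict(self, X):
        # Score each class via the kernel map against training data.
        K = rbf_kernel(X, self.X, self.gamma)
        return self.classes_[np.argmax(K @ self.beta, axis=1)]
```

In practice the rows of `X` would be the fused feature vectors (statistical plus empirical) described above; the closed-form solve is what makes (K)ELM training fast compared with iteratively trained networks.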
ISSN: 2169-3536