Fourier Hilbert: The input transformation to enhance CNN models for speech emotion recognition

Bibliographic Details
Main Author: Bao Long Ly
Format: Article
Language: English
Published: KeAi Communications Co. Ltd. 2024-01-01
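Series: Cognitive Robotics
Subjects: Speech emotion recognition; CNN; Input transformation; Enhancing; Fourier transformation; Hilbert curve
Online Access: http://www.sciencedirect.com/science/article/pii/S2667241324000168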
author Bao Long Ly
collection DOAJ
description Signal processing in general, and speech emotion recognition in particular, have long been familiar Artificial Intelligence (AI) tasks. With the explosion of deep learning, CNN models are used more frequently, accompanied by the emergence of many signal transformations. However, these methods often require significant hardware and runtime. To address these issues, we analyze and learn from existing transformations, which leads us to propose a new method: the Fourier Hilbert Transformation (FHT). In general, this method applies the Hilbert curve to Fourier images. The resulting images are small and dense, a shape well suited to the CNN architecture. Additionally, the better distribution of information across the image allows the filters to fully utilize their power. These points support the argument that FHT provides an optimal input for CNNs. Experiments conducted on popular datasets yielded promising results. FHT saves a large amount of hardware usage and runtime while maintaining high performance, and it even offers greater stability than existing methods. This opens up opportunities for deploying signal processing tasks on real-time systems with limited hardware.
format Article
id doaj-art-9e65b5fc189f4f8a918c02817453a658
institution OA Journals
issn 2667-2413
language English
publishDate 2024-01-01
publisher KeAi Communications Co. Ltd.
record_format Article
series Cognitive Robotics
spelling doaj-art-9e65b5fc189f4f8a918c02817453a658
2025-08-20T01:56:49Z
eng
KeAi Communications Co. Ltd.
Cognitive Robotics
2667-2413
2024-01-01
4
228
236
10.1016/j.cogr.2024.11.002
Fourier Hilbert: The input transformation to enhance CNN models for speech emotion recognition
Bao Long Ly
FPT University, Ho Chi Minh, Viet Nam
http://www.sciencedirect.com/science/article/pii/S2667241324000168
Speech emotion recognition
CNN
Input transformation
Enhancing
Fourier transformation
Hilbert curve
title Fourier Hilbert: The input transformation to enhance CNN models for speech emotion recognition
topic Speech emotion recognition
CNN
Input transformation
Enhancing
Fourier transformation
Hilbert curve
url http://www.sciencedirect.com/science/article/pii/S2667241324000168
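
The description above says only that the method "applies the Hilbert curve to Fourier images"; the record gives no further detail. The following is a minimal illustrative sketch of that general idea, not the author's implementation. It assumes a single FFT over the whole signal, magnitude-only values, and a square image of side 2**order. The function names (hilbert_d2xy, fourier_hilbert_image) and the parameter order are hypothetical.

```python
import numpy as np


def hilbert_d2xy(order, d):
    """Map distance d along a Hilbert curve to (x, y) on a 2**order grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the quadrant so consecutive d stay spatially adjacent.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def fourier_hilbert_image(signal, order):
    """Place FFT magnitudes on a 2**order x 2**order image along a Hilbert curve.

    Assumption (not from the source): one real FFT over the signal, which is
    zero-padded or cropped to 2 * n_bins samples; the DC bin is dropped so
    that exactly n_bins magnitudes remain.
    """
    n = 1 << order
    n_bins = n * n
    spectrum = np.abs(np.fft.rfft(signal, n=2 * n_bins))[1:]
    image = np.zeros((n, n), dtype=np.float64)
    for d, value in enumerate(spectrum):
        x, y = hilbert_d2xy(order, d)
        image[y, x] = value
    return image
```

Because neighbouring positions on a Hilbert curve are also neighbours on the grid, adjacent frequency bins land in adjacent pixels, which is one plausible reading of why the abstract calls the resulting images "small and dense". Calling fourier_hilbert_image(signal, 3), for example, would produce an 8 x 8 array.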