CNN Based Automatic Speech Recognition: A Comparative Study

Bibliographic Details
Main Authors: Hilal Ilgaz, Beyza Akkoyun, Özlem Alpay, M. Ali Akcayol
Format: Article
Language: English
Published: Ediciones Universidad de Salamanca 2024-08-01
Series: Advances in Distributed Computing and Artificial Intelligence Journal
Online Access: https://revistas.usal.es/cinco/index.php/2255-2863/article/view/29191
Description
Summary: Deep learning has recently become one of the most common approaches to speech recognition. The most advanced results have been obtained with speech recognition systems built on convolutional neural networks (CNNs) and recurrent neural networks (RNNs). Because CNNs capture local features effectively, they are applied to tasks with relatively short-term dependencies, such as keyword detection or phoneme-level sequence recognition. This paper presents the development of a deep learning-based speech command recognition system. The Google Speech Commands Dataset was used for training. The dataset contains 65,000 one-second-long utterances of 30 short English words; 80% of the dataset was used for training and 20% for testing. The one-second voice commands were converted into spectrograms and used to train different artificial neural network (ANN) models. Several CNN variants are used in deep learning applications. The performance of the proposed model reached 94.60%.
ISSN: 2255-2863
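
The preprocessing mentioned in the summary (one-second clips converted to spectrograms, with an 80/20 train/test split) might look roughly like the sketch below. It is an illustration only, not the authors' pipeline: the 16 kHz sample rate, the 25 ms Hann window, the 10 ms hop, the log scaling, and the helper names `log_spectrogram` and `split_indices` are all assumptions that the record does not state, and a synthetic tone stands in for a real recording.

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumed; the summary only says clips are one second long


def log_spectrogram(clip, frame_len=400, hop=160, eps=1e-10):
    """Hypothetical helper: log-magnitude spectrogram, shape (frames, bins).

    frame_len=400 and hop=160 correspond to 25 ms windows and a 10 ms hop
    at 16 kHz. These values are assumptions, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(clip) - frame_len) // hop
    frames = np.stack(
        [clip[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + eps)


def split_indices(n_items, train_fraction=0.8, seed=0):
    """Hypothetical helper: shuffled index arrays for a train/test split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_items)
    cut = round(n_items * train_fraction)
    return order[:cut], order[cut:]


# A synthetic one-second 440 Hz tone stands in for a spoken command.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
clip = np.sin(2 * np.pi * 440.0 * t)

spec = log_spectrogram(clip)
train_idx, test_idx = split_indices(65_000)

print(spec.shape)                     # (frames, frequency bins)
print(len(train_idx), len(test_idx))  # sizes of the two partitions
```

Each resulting array is a two-dimensional time-frequency image, which is the kind of input a CNN can process with ordinary 2-D convolutions.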