Assessing costa rican children speech recognition by humans and machines

In recent years, a growing number of studies on human-computer interaction have been conducted, driven by the pervasive speech interfaces in systems such as cell phones and personal and home automation assistants. These studies include automatic speech recognition (ASR) and speech synthesis,...

Bibliographic Details
Main Authors: Maribel Morales-Rodríguez, Marvin Coto-Jiménez
Format: Article
Language: English
Published: Instituto Tecnológico de Costa Rica 2022-11-01
Series: Tecnología en Marcha
Subjects: Children speech, speech recognition, speech technologies, WER
Online Access: https://172.20.14.50/index.php/tec_marcha/article/view/6453
author Maribel Morales-Rodríguez
Marvin Coto-Jiménez
collection DOAJ
description In recent years, a growing number of studies on human-computer interaction have been conducted, driven by the pervasive speech interfaces in systems such as cell phones and personal and home automation assistants. These studies cover automatic speech recognition (ASR) and speech synthesis, and they consider a widening range of signal conditions, such as noise and reverberation, as well as accents and age-related effects. One of the key challenges is the development of ASR for children’s speech. Because current systems depend on language and accent, improving them requires research into speech recognition technologies suited to children. In this paper, we assess commercial ASR systems for the recognition of Costa Rican children’s speech, for speakers aged three to fourteen years. To compare and numerically validate the ASR systems in recognizing children’s isolated words, we conducted a large subjective listening test that quantifies the differences and the challenges that remain for state-of-the-art ASR systems. The results show clear numeric differences between ASR systems and human perception, especially for younger children. Additionally, we provide suggestions for future research directions in the field.
format Article
id doaj-art-a932958e58ca4d1db92913aff68c5fa9
institution Kabale University
issn 0379-3982
2215-3241
language English
publishDate 2022-11-01
publisher Instituto Tecnológico de Costa Rica
record_format Article
series Tecnología en Marcha
spelling doaj-art-a932958e58ca4d1db92913aff68c5fa9
2025-08-20T03:39:36Z
eng
Instituto Tecnológico de Costa Rica
Tecnología en Marcha
0379-3982
2215-3241
2022-11-01
35
8
10.18845/tm.v35i8.6453
Assessing costa rican children speech recognition by humans and machines
Maribel Morales-Rodríguez
Marvin Coto-Jiménez
https://172.20.14.50/index.php/tec_marcha/article/view/6453
Children speech
speech recognition
speech technologies
WER
title Assessing costa rican children speech recognition by humans and machines
topic Children speech
speech recognition
speech technologies
WER
url https://172.20.14.50/index.php/tec_marcha/article/view/6453