Assessing Costa Rican children speech recognition by humans and machines
In recent years, an increasing number of studies on human-computer interaction have been conducted, driven by the pervasive speech interfaces in systems such as cell phones and personal and home automation assistants. These studies include automatic speech recognition (ASR) and speech synthesis,...
| Main Authors: | Maribel Morales-Rodríguez, Marvin Coto-Jiménez |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Instituto Tecnológico de Costa Rica, 2022-11-01 |
| Series: | Tecnología en Marcha |
| Subjects: | Children speech; speech recognition; speech technologies; WER |
| Online Access: | https://172.20.14.50/index.php/tec_marcha/article/view/6453 |
| _version_ | 1849395462419775488 |
|---|---|
| author | Maribel Morales-Rodríguez; Marvin Coto-Jiménez |
| author_facet | Maribel Morales-Rodríguez; Marvin Coto-Jiménez |
| author_sort | Maribel Morales-Rodríguez |
| collection | DOAJ |
| description | In recent years, an increasing number of studies on human-computer interaction have been conducted, driven by the pervasive speech interfaces in systems such as cell phones and personal and home automation assistants. These studies include automatic speech recognition (ASR) and speech synthesis, and consider a wider variety of signal conditions, such as noise and reverberation, as well as accents and age-related effects. For example, one of the key challenges is the development of ASR for children’s speech. Since current systems depend on language and accent, research on speech recognition technologies suitable for children is needed to improve them. In this paper, we assess commercial ASR systems for the recognition of Costa Rican children’s speech, for users aged between three and fourteen years. To establish a comparison and numeric validation of the ASR systems in recognizing children’s isolated words, we conducted a large subjective listening test that quantifies the differences and challenges that remain for state-of-the-art ASR systems. The results show clear numeric differences between ASR systems and human perception, especially for younger children. Additionally, we provide suggestions for future research directions in the field. |
| format | Article |
| id | doaj-art-a932958e58ca4d1db92913aff68c5fa9 |
| institution | Kabale University |
| issn | 0379-3982 2215-3241 |
| language | English |
| publishDate | 2022-11-01 |
| publisher | Instituto Tecnológico de Costa Rica |
| record_format | Article |
| series | Tecnología en Marcha |
| spelling | doaj-art-a932958e58ca4d1db92913aff68c5fa9; 2025-08-20T03:39:36Z; eng; Instituto Tecnológico de Costa Rica; Tecnología en Marcha; 0379-3982; 2215-3241; 2022-11-01; 35; 8; 10.18845/tm.v35i8.6453; Assessing costa rican children speech recognition by humans and machines; Maribel Morales-Rodríguez; Marvin Coto-Jiménez; https://172.20.14.50/index.php/tec_marcha/article/view/6453; Children speech; speech recognition; speech technologies; WER |
| spellingShingle | Maribel Morales-Rodríguez Marvin Coto-Jiménez Assessing costa rican children speech recognition by humans and machines Tecnología en Marcha Children speech speech recognition speech technologies WER |
| title | Assessing Costa Rican children speech recognition by humans and machines |
| title_full | Assessing Costa Rican children speech recognition by humans and machines |
| title_fullStr | Assessing Costa Rican children speech recognition by humans and machines |
| title_full_unstemmed | Assessing Costa Rican children speech recognition by humans and machines |
| title_short | Assessing Costa Rican children speech recognition by humans and machines |
| title_sort | assessing costa rican children speech recognition by humans and machines |
| topic | Children speech; speech recognition; speech technologies; WER |
| url | https://172.20.14.50/index.php/tec_marcha/article/view/6453 |
| work_keys_str_mv | AT maribelmoralesrodriguez assessingcostaricanchildrenspeechrecognitionbyhumansandmachines AT marvincotojimenez assessingcostaricanchildrenspeechrecognitionbyhumansandmachines |