Conditional Random Fields Applied to Arabic Orthographic-Phonetic Transcription

Orthographic-To-Phonetic (O2P) Transcription is the process of learning the relationship between the written word and its phonetic transcription. It is a necessary part of Text-To-Speech (TTS) systems and it plays an important role in handling Out-Of-Vocabulary (OOV) words in Automatic Speech Recogn...

Bibliographic Details
Main Authors: El-Hadi CHERIFI, Mhania GUERTI
Format: Article
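Language: English
Published: Institute of Fundamental Technological Research Polish Academy of Sciences, 2021-06-01
Series: Archives of Acoustics
Subjects: Orthographic-To-Phonetic Transcription; Conditional Random Fields; text-to-speech; Arabic speech synthesis; Modern Standard Arabic
Online Access: https://acoustics.ippt.pan.pl/index.php/aa/article/view/2788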
author El-Hadi CHERIFI
Mhania GUERTI
collection DOAJ
description Orthographic-To-Phonetic (O2P) transcription is the task of learning the relationship between a written word and its phonetic transcription. It is a necessary part of Text-To-Speech (TTS) systems and plays an important role in handling Out-Of-Vocabulary (OOV) words in Automatic Speech Recognition systems. O2P is a complex task because, for many languages, the correspondence between orthography and phonetic transcription is not completely consistent. Over time, the techniques used to tackle this problem have evolved from early rule-based systems to the current, more sophisticated machine learning approaches. In this paper, we propose an approach to Arabic O2P conversion based on a probabilistic method, Conditional Random Fields (CRF). We discuss the experiments and results obtained by applying this method to a pronunciation dictionary of the Most Commonly used Arabic Words, which we call MCAW-Dic. MCAW-Dic contains over 35,000 words in Modern Standard Arabic (MSA) with their pronunciations; we developed it ourselves with the assistance of phoneticians and linguists from the University of Tlemcen. The results are very satisfactory and point the way towards future innovations: in all our tests, the Phoneme Error Rate (PER) was between 11% and 15%. This result could be improved by including a larger context, but in that case we encountered memory limitations and computational difficulties.
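The abstract reports its results as a Phoneme Error Rate. The record does not say how the authors computed it, so the following is only a minimal sketch under a common assumption: PER is the total Levenshtein edit distance between reference and predicted phoneme sequences, divided by the total number of reference phonemes. The function names and the toy phoneme sequences are hypothetical and are not taken from MCAW-Dic.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] is the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(references: List[Sequence[str]],
                       hypotheses: List[Sequence[str]]) -> float:
    """Assumed definition: total edit errors / total reference phonemes."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have equal length")
    total = sum(len(r) for r in references)
    if total == 0:
        raise ValueError("references contain no phonemes")
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return errors / total


# Toy data (hypothetical): one substitution and one deletion over 10 phonemes.
refs = [["k", "i", "t", "a:", "b"], ["q", "a", "l", "a", "m"]]
hyps = [["k", "i", "t", "a", "b"], ["q", "a", "l", "m"]]
per = phoneme_error_rate(refs, hyps)
print(f"PER = {per:.0%}")
```

Under this definition, the reported range of 11% to 15% would correspond to roughly one phoneme-level error in every seven to nine reference phonemes.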
format Article
id doaj-art-ae7ced46a0f74accba2dd9bd59e1a9a4
institution DOAJ
issn 0137-5075
2300-262X
language English
publishDate 2021-06-01
publisher Institute of Fundamental Technological Research Polish Academy of Sciences
record_format Article
series Archives of Acoustics
volume 46
issue 2
doi 10.24425/aoa.2021.136574
affiliation National Polytechnic School (both authors)
title Conditional Random Fields Applied to Arabic Orthographic-Phonetic Transcription
topic Orthographic-To-Phonetic Transcription
Conditional Random Fields
text-to-speech
Arabic speech synthesis
Modern Standard Arabic
url https://acoustics.ippt.pan.pl/index.php/aa/article/view/2788