Prosody annotation for unit selection TTS synthesis

Bibliographic Details
Main Authors: Grażyna DEMENKO, Agnieszka WAGNER
Format: Article
Language: English
Published: Institute of Fundamental Technological Research Polish Academy of Sciences 2014-04-01
Series: Archives of Acoustics
Online Access: https://acoustics.ippt.pan.pl/index.php/aa/article/view/763
Description
Summary: This paper concerns prosody annotation and intonation modeling, especially for application in corpus-based speech synthesis. To establish rules for automatic intonation modeling, a four-hour, fully annotated speech database was analyzed acoustically and perceptually. The speech material included different text types, dialogs, and prosodically rich phrases. As a result of these analyses, a basic prosodic annotation distinguishing 6 pitch accent types and 5 types of prosodic phrases was established. Moreover, the analyses made it possible to define rules for semi-automatic stylization and parameterization of intonation contours for application in text-to-speech and speech recognition systems. The paper discusses the assumptions behind the stylization method and the results of a quantitative and qualitative evaluation of stylization accuracy, based on speech material of ca. 1000 phrases from a literary text read by female and male speakers. Finally, a classification of pitch accents and boundary tones based on the parameterization is presented.
ISSN: 0137-5075, 2300-262X