Perception aspects of a rule system for converting melodies from musical notation into sound

Bibliographic Details
Main Authors: L. FRYDEN, J. SUNDBERG, A. ASKENFELT
Format: Article
Language: English
Published: Institute of Fundamental Technological Research Polish Academy of Sciences 2015-07-01
Series: Archives of Acoustics
Online Access: https://acoustics.ippt.pan.pl/index.php/aa/article/view/3074
Description
Summary: The starting point of this project is the "mechanical" impression obtained when a computer, rather than a good musician, converts musical notation into the corresponding sound sequences. A basic assumption is that this mechanical effect can be partly eliminated by introducing "pronunciation rules", which introduce minute, context-dependent deviations from the durations, pitches and amplitudes specified in the musical score. The method applied is analysis by synthesis. A computer program developed for the conversion of text to speech (CARLSON and GRANSTROM [3]) is applied to the singing synthesizer MUSSE (LARSSON [4]). In the present, musical version of the conversion program, the input is the melody in musical notation, and the output is the melody, synthetically performed. A set of such pronunciation rules has been formulated for, and tested on, traditional Western melodies. One group of rules operates with short time windows of two or three notes; these rules are applied using as criteria the size of the musical interval formed by adjacent notes, or the difference in duration between two or three adjacent notes. Another type of rule operates with a time window of variable length; the window is limited by the distance between adjacent chord changes. The rules manipulate the duration and amplitude of notes. In other words, all these rules introduce discrepancies between what is written in the notation and what is actually played. The effects of these rules are evaluated by means of listening tests with musically sophisticated judges. The results show that the musical acceptability of such synthesized performances can be substantially improved by applying these "pronunciation" rules. Furthermore, our experience indicates that the magnitude of these discrepancies between what is written and what is supposed to be played is very critical. If the magnitude is so high that the effect is identified for what it actually is, then the effect generally tends to be musically impossible. If, on the other hand, the magnitude is too small, there is, of course, no effect at all. It seems that the musically useful effects are found between the threshold of perception and this threshold of diagnosis. The perceptual implications of these rules regarding musical communication will be discussed, and the effects of the rules may be demonstrated by means of tape illustrations.
ISSN: 0137-5075
2300-262X
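
The summary describes rules that look at a window of two or three adjacent notes and make small changes to duration and amplitude. The Python sketch below illustrates only that general mechanism. It is not one of the authors' rules: the `Note` fields, the function `apply_leap_rule`, the interval criterion (more than two semitones), the scaling constants, and the `quantity` parameter are all assumptions made up for illustration, since the record gives no rule definitions.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Note:
    pitch: int          # semitone number (MIDI-style); an assumption, not from the source
    duration_ms: float  # nominal duration as written in the score
    level_db: float     # nominal amplitude as written in the score


def apply_leap_rule(notes: List[Note], quantity: float = 1.0) -> List[Note]:
    """Hypothetical two-note-window rule.

    When the interval between adjacent notes exceeds two semitones, the
    second note is lengthened and raised in level by a small amount that
    grows with the interval. `quantity` scales the size of the deviation;
    quantity=0 reproduces the score exactly.
    """
    out = list(notes)
    for i in range(1, len(notes)):
        # Criterion: size of the interval formed by two adjacent notes.
        interval = abs(notes[i].pitch - notes[i - 1].pitch)
        if interval > 2:  # illustrative criterion only
            out[i] = replace(
                notes[i],
                # Illustrative constants: 1 % duration and 0.1 dB per semitone.
                duration_ms=notes[i].duration_ms * (1 + 0.01 * interval * quantity),
                level_db=notes[i].level_db + 0.1 * interval * quantity,
            )
    return out


if __name__ == "__main__":
    melody = [Note(60, 400.0, 0.0), Note(62, 400.0, 0.0), Note(67, 400.0, 0.0)]
    for q in (0.0, 1.0, 5.0):
        print(q, apply_leap_rule(melody, quantity=q))
```

Varying `quantity` is a simple way to explore the point made in the summary about magnitude: at very small values the deviation would not be heard, and at large values it would be noticed as a deviation rather than as musical expression. Where those two thresholds lie is an empirical question that this sketch does not answer.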