Quote:

In all fairness to Dynavox, their target market is Speech Synthesis, not singing/music.



I think this is a key point. Music requires a lot of things that "ordinary" speech synthesis doesn't. Here is a paper that does a good job explaining it.

The frequency envelope of the voice needs to take into account a number of additional factors:
  • Portamento: Notes that belong to the same word or phrase should be smoothly connected, instead of jumping from note to note.
  • Preparation: Before moving to the next note, the pitch may move in the opposite direction first in preparation of the change.
  • Overshoot: Before hitting the target pitch, the pitch may overshoot the target.
  • Vibrato: Sustained notes will typically have an added vibrato.
  • Fluctuation: Holding the pitch perfectly is unnatural, so some low-level fluctuation needs to be added.
All of these are easily observed when looking at the frequency match line from a pitch correction program. (I left off "undershoot" and "scooping", which I see in my own vocals way too much!)
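To make the list above a bit more concrete, here's a rough Python sketch of a pitch contour generator. It is not the paper's method, just one simple way to get similar shapes: an underdamped second-order "spring" chases each target note, which gives a smooth glide (portamento) and a small overshoot for free, and a separate pass adds slow random drift (fluctuation). Preparation and vibrato are not modeled here. The function names, the 200 frames/second rate, and the default constants are all my own assumptions.

```python
import math
import random


def pitch_contour(notes, frame_rate=200, natural_freq_hz=6.0, damping=0.6):
    """Return one pitch value (in MIDI semitones) per frame.

    notes: list of (midi_pitch, duration_seconds) tuples.
    A damping value below 1.0 makes the response underdamped, so the
    pitch glides to each new note and overshoots it slightly.
    All default values are illustrative guesses, not tuned numbers.
    """
    dt = 1.0 / frame_rate
    omega = 2.0 * math.pi * natural_freq_hz
    pitch = float(notes[0][0])  # start exactly on the first note
    velocity = 0.0
    contour = []
    for target, duration in notes:
        for _ in range(int(round(duration * frame_rate))):
            # Spring pulls toward the target; damping resists motion.
            accel = omega * omega * (target - pitch) - 2.0 * damping * omega * velocity
            velocity += accel * dt
            pitch += velocity * dt
            contour.append(pitch)
    return contour


def add_fluctuation(contour, depth_semitones=0.05, smoothing=0.9, seed=0):
    """Add slow random drift so held notes are never perfectly flat.

    The drift is smoothed uniform noise, so it always stays within
    +/- depth_semitones of the input contour.
    """
    rng = random.Random(seed)
    drift = 0.0
    result = []
    for pitch in contour:
        drift = smoothing * drift + (1.0 - smoothing) * rng.uniform(-1.0, 1.0)
        result.append(pitch + depth_semitones * drift)
    return result


def midi_to_hz(midi_pitch):
    """Convert a (possibly fractional) MIDI pitch to Hz, with A4 = 69 = 440 Hz."""
    return 440.0 * 2.0 ** ((midi_pitch - 69) / 12.0)


if __name__ == "__main__":
    # Middle C for half a second, then up a major third to E.
    contour = add_fluctuation(pitch_contour([(60, 0.5), (64, 0.5)]))
    frequencies = [midi_to_hz(p) for p in contour]
    print(len(frequencies), "frames, peak", round(max(frequencies), 2), "Hz")
```

Working in semitones rather than Hz keeps the glide and overshoot the same musical size no matter which octave the note is in; the conversion to Hz only happens at the end.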

The paper cited above looks like it's got enough information to implement these features. I'd want logic to prevent vibrato on notes less than a particular duration.
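The duration check I have in mind could look something like this in Python. It's only a sketch: the 0.4 second cutoff, the onset delay, the fade-in, and the vibrato rate and depth are all guesses on my part, not values from the paper, and `vibrato_offsets` is a made-up name.

```python
import math


def vibrato_offsets(duration, frame_rate=200, rate_hz=5.5, depth_semitones=0.5,
                    min_duration=0.4, onset_delay=0.2, fade_in=0.2):
    """Return per-frame pitch offsets (in semitones) for one note.

    Notes shorter than min_duration get no vibrato at all. Longer notes
    stay flat for onset_delay seconds, then the vibrato depth ramps up
    over fade_in seconds. All defaults are illustrative assumptions.
    """
    frame_count = int(round(duration * frame_rate))
    if duration < min_duration:
        return [0.0] * frame_count  # too short: leave the note plain

    offsets = []
    for i in range(frame_count):
        t = i / frame_rate
        if t < onset_delay:
            offsets.append(0.0)
            continue
        elapsed = t - onset_delay
        ramp = min(1.0, elapsed / fade_in)  # ease the vibrato in
        offsets.append(ramp * depth_semitones * math.sin(2.0 * math.pi * rate_hz * elapsed))
    return offsets


if __name__ == "__main__":
    print(max(vibrato_offsets(0.2)))            # short note: no vibrato
    print(round(max(vibrato_offsets(1.0)), 3))  # long note: close to full depth
```

The offsets are meant to be added on top of whatever base pitch curve the note already has, so the gate can be changed without touching the rest of the envelope.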

There's also the question of how well their voices will translate to singing. I assume the DynaVox model automatically handles formant preservation since they're synthesizing the voice in the first place. The paper cites the addition of a "singing formant" at about 3 kHz, along with amplitude modulation based on vibrato (volume changes along with the vibrato).
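The amplitude modulation part is simple enough to sketch in Python. This assumes the volume moves in phase with the vibrato and by about 10%, both of which are my guesses rather than figures from the paper, and `vibrato_gain` is a hypothetical helper name. The singing formant itself would need a filter and isn't covered here.

```python
def vibrato_gain(pitch_offsets, vibrato_depth=0.5, am_depth=0.1):
    """Turn per-frame vibrato pitch offsets into per-frame gain values.

    pitch_offsets: vibrato offsets in semitones, one per frame.
    vibrato_depth: the peak offset the vibrato is expected to reach.
    am_depth: how far the gain swings around 1.0 (0.1 means +/- 10%).
    The gain follows the pitch offset directly (in phase), which is an
    assumption made for this sketch.
    """
    if vibrato_depth <= 0:
        raise ValueError("vibrato_depth must be positive")
    gains = []
    for offset in pitch_offsets:
        # Normalize to -1..1 and clamp so the gain stays bounded.
        normalized = max(-1.0, min(1.0, offset / vibrato_depth))
        gains.append(1.0 + am_depth * normalized)
    return gains


if __name__ == "__main__":
    print(vibrato_gain([0.0, 0.5, -0.5, 0.25]))
```

Each output sample would then be multiplied by the gain for its frame, so frames with no vibrato (offset 0) pass through at unity gain.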

Quote:

With no one to keep a fire under them and cattle prod them frequently, the project will remain on the proverbial back burner indefinitely. Therefore, I need someone who can get the project moved OVER the burner. I can provide names and email addresses if anyone wants to accept the challenge.



If the DynaVox people are truly interested in this, I could prod them.

I don't know how difficult this would be to add to their product, but the basic ideas are pretty straightforward. It's mostly a matter of creating a dynamic frequency envelope. That'll get you a long way toward a more realistic singing voice.

There was a project like this for the free Festival system, but it seems to be mostly dead links now. While Festival is intelligible, it's not really that pleasant to listen to.


-- David Cuny
My virtual singer development blog

Vocal control, you say. Never heard of it. Is that some kind of ProTools thing?