In this project, it looks like they are creating collections of phonetic units, and classifying the donor's parameters to build them. It seems like pretty standard stuff, but there might be some "secret sauce" that I'm unaware of.

Along those lines, I've read a paper where the researcher replaced key phonemes in otherwise generic synthesized output, and was able to overlay the "personality" of the voice. So they might be taking advantage of that as well.

Vocaloid is still the "gold standard" of singing resynthesis, as far as I know, and it takes months to create a good voice from a donor. It also helps if you've got native English speakers on the team.

If you're interested, Avanna is currently on sale for $49, and she's probably the best English Vocaloid currently on the market. It only comes with the "tiny" editor, which is limited to creating 18 bars at a time, so you'd have to stitch the output together to make a full song. Personally, I don't find that much of a limitation, because I prefer to build melodies one section at a time.
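
For anyone wondering what the stitching involves, here's a minimal sketch in Python. It assumes each 18-bar section has been exported as an uncompressed WAV file with the same sample rate, bit depth and channel count, and that each section starts and ends on a bar boundary. The function name and file names are made up for illustration; this is not anything the editor provides.

```python
import wave


def stitch_wavs(section_paths, out_path):
    """Concatenate WAV files end to end; all must share the same format."""
    params = None
    chunks = []
    for path in section_paths:
        with wave.open(path, "rb") as w:
            fmt = (w.getnchannels(), w.getsampwidth(), w.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                # Mixing formats would produce garbage, so stop early.
                raise ValueError(f"{path} has format {fmt}, expected {params}")
            chunks.append(w.readframes(w.getnframes()))
    if params is None:
        raise ValueError("no input files given")

    with wave.open(out_path, "wb") as out:
        out.setnchannels(params[0])
        out.setsampwidth(params[1])
        out.setframerate(params[2])
        for chunk in chunks:
            out.writeframes(chunk)


# Hypothetical usage:
# stitch_wavs(["verse1.wav", "chorus.wav", "verse2.wav"], "full_song.wav")
```

In practice a DAW does the same job and lets you crossfade the joins, but a script like this is handy when the sections already line up cleanly.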

I've been working for the last couple of weeks trying to get synSinger to automatically extract parameters from audio samples, but have been getting mixed results. If there's anyone with some audio/programming background, I'd love to hear from them!

As far as BiaB goes for vocal synthesis, the MusicXML it creates is pretty broken. I reported and re-reported bugs (wrong pitches, rests between syllables) over a year ago, and have seen no fixes.

It's very frustrating, especially since I'm trying to use BiaB to generate output for synSinger. :(
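
To give an idea of what the "rests between syllables" problem looks like in a file, here's a rough sketch of a checker in Python. It assumes an uncompressed, namespace-free MusicXML document with a single voice and a single verse per part, and it takes "rest between syllables" to mean a rest that shows up after a lyric marked `begin` or `middle` but before the word's `end` syllable. The function name is hypothetical, and this is not how synSinger reads the file.

```python
import xml.etree.ElementTree as ET


def rests_inside_words(musicxml_text):
    """Return (measure number, note index) for each rest that falls mid-word."""
    root = ET.fromstring(musicxml_text)
    hits = []
    for part in root.iter("part"):
        in_word = False  # True after a "begin" or "middle" syllable
        for measure in part.iter("measure"):
            for i, note in enumerate(measure.findall("note")):
                if note.find("rest") is not None:
                    if in_word:
                        hits.append((measure.get("number"), i))
                    continue
                # Notes without a lyric (held syllables) leave the state alone.
                syllabic = note.findtext("lyric/syllabic")
                if syllabic in ("begin", "middle"):
                    in_word = True
                elif syllabic in ("single", "end"):
                    in_word = False
    return hits
```

It only flags the suspicious spots; deciding whether a given rest is a bug or an intentional breath still takes a look at the score.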


-- David Cuny
My virtual singer development blog

Vocal control, you say. Never heard of it. Is that some kind of ProTools thing?