Fellow home recordists:

I found out about this project through a TED talk. It's an initiative to vastly expand the availability and variety of recorded human speech for people who rely on speech synthesis or speech assistance.

The main project page: https://vocalid.co/

The page where we can donate our voices: https://vocalid.co/voicebank

So many of us have better-than-average recording setups at home. Let's band together to provide well-recorded speech to those who need it.

-Scott
Thanks, Scott!

Folks marvel at the capabilities of my Dynavox Maestro® Speech Synthesis Device, but even with the recent advancements in DSP (Digital Speech Processing), we both know that the technology is still in its infancy and progressing exponentially. What is possible today will be outdone in a few months. Already, my Maestro can be completely controlled by eye movements without connections.

I am privileged to see some R&D projects currently under development and they are astonishing.

I demonstrate the device at VA Medical Centers for veterans who have become speech deprived. To think that we will soon SING is something so wonderful.
I'm envisioning the impact that Scott, Dr Gannon, Matt Finley, David Cuny, and others would have on this technology. Wow! I might sing again!

Dr Donald K. Reynolds, Dean of the College of Engineering at UW, said that a primary purpose of higher education is to prevent us from "...trying to reinvent the wheel!" So it behooves us to know and understand the current state-of-the-art. Perhaps Scott, as an Audio Engineer, would be best qualified for that role.
Thanks for posting this. I put it on my Facebook page with a shout-out to the singers, actors and voiceover folks I know to get involved, which some have apparently done.

Cheers,

Ed
In this project, it looks like they are creating collections of phonetic units, and classifying the donor parameters to build collections. It seems like it would be pretty standard stuff, but there might be some "secret sauce" that I'm unaware of.

Along those lines, I've read a paper where the researcher replaced key phonemes in otherwise generic synthesized output, and was able to overlay the "personality" of the voice. So they might be taking advantage of that as well.

Vocaloid is still the "gold standard" of singing resynthesis, as far as I know, and the process of creating a good voice from a donor takes months. It also helps if you've got native English speakers on the team.

If you're interested, Avanna is currently on sale for $49, and she's probably the best English Vocaloid currently on the market. It only comes with the "tiny" editor, which is limited to creating 18 bars at a time, so you'd have to stitch the output together to make a full song. Personally, I don't find that much of a limitation, because I prefer to build melodies one section at a time.

I've been working for the last couple of weeks trying to get synSinger to automatically extract parameters from audio samples, but have been getting mixed results. If there's anyone with some audio/programming background, I'd love to hear from them!

As far as BiaB goes for vocal synthesis, the MusicXML it creates is pretty broken. I reported and re-reported bugs (wrong pitches, rests between syllables) over a year ago, and have seen no fixes.

It's very frustrating, especially since I'm trying to use BiaB to generate output for synSinger. :(
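
For anyone who wants to check an export for the "rests between syllables" problem, here is a minimal sketch in Python. It assumes an uncompressed MusicXML file in which lyrics use the standard `<syllabic>` values (`begin`, `middle`, `end`, `single`); whether BiaB's output matches that layout is an assumption, and the function name and the sample score are made up for illustration. It flags any rest that appears after a `begin` or `middle` syllable, that is, in the middle of a word.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-note sample: the word "hel-lo" with a rest wedged
# between its two syllables.
SAMPLE = """<score-partwise version="3.1">
  <part id="P1">
    <measure number="1">
      <note>
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>1</duration>
        <lyric><syllabic>begin</syllabic><text>hel</text></lyric>
      </note>
      <note>
        <rest/>
        <duration>1</duration>
      </note>
      <note>
        <pitch><step>D</step><octave>4</octave></pitch>
        <duration>1</duration>
        <lyric><syllabic>end</syllabic><text>lo</text></lyric>
      </note>
    </measure>
  </part>
</score-partwise>"""


def find_mid_word_rests(musicxml_text):
    """Return (measure number, preceding syllable) for each rest inside a word."""
    root = ET.fromstring(musicxml_text)
    problems = []
    for part in root.iter("part"):
        inside_word = False  # True after a "begin" or "middle" syllable
        last_text = None
        for measure in part.iter("measure"):
            for note in measure.iter("note"):
                if note.find("rest") is not None:
                    if inside_word:
                        problems.append((measure.get("number"), last_text))
                    continue
                syllabic = note.findtext("lyric/syllabic")
                if syllabic is None:
                    continue  # note without its own syllable: state unchanged
                last_text = note.findtext("lyric/text")
                inside_word = syllabic in ("begin", "middle")
    return problems


print(find_mid_word_rests(SAMPLE))
```

A compressed `.mxl` file would need to be unzipped first, and a rest in the middle of a word is occasionally intentional, so anything this reports is a place to look rather than a confirmed bug.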
David, thanks for that info...Avanna is very interesting! Do you know if it is limited unless you buy Vocaloid 3 separately? Also, have you ever checked out this product, http://realitone.com/blue/ ? I was thinking of buying it during their sale.
Originally Posted By: JohnJohnJohn
David, thanks for that info...Avanna is very interesting! Do you know if it is limited unless you buy Vocaloid 3 separately? Also, have you ever checked out this product, http://realitone.com/blue/ ? I was thinking of buying it during their sale.

Avanna comes with the free "tiny" editor by default. There are two limitations with the "tiny" editor:
  • You can only work with one vocal track at a time, and
  • You only get 18 bars per session

Since I work in a DAW, it's not really a problem for me to assemble the parts from pieces, and layer the tracks. You can also import a single audio track, so you can play the backing tracks as you're fiddling with the vocals.

I've not heard of Realivox. Props to the video demo - they do a good job explaining what the product can't do. Vocaloid doesn't have those limitations - it automatically anticipates the consonants. Plus, they state: "Finally, your search for a library that says 'oosht' is over!" :D

The legato from pitch to pitch also isn't as nice as Vocaloid's. On the other hand, Realivox looks pretty fun to play.

So if you're looking for a nice background vocal, Realivox is probably a better choice than Vocaloid, as long as you're aware of the tool's limitations.
I'd like to steer the conversation back to the voicebank project.
I posted it among my HS Classmates and have been getting a lot of interest. I posted VocalID's promo today.

Sorry if I hijacked your thread, it was not intentional.
Don, no hi-jack.

If you, the reader, are just arriving at this thread with this post, please follow the links in the first post and consider donating your voice to the voicebank.

If nothing else, watch the TED talk video: http://www.ted.com/talks/rupal_patel_synthetic_voices_as_unique_as_fingerprints?language=en

See if that doesn't inspire you...
I recorded my first 500 sentences tonight. A tip: make yourself comfortable so that you can read the sentences aloud while recording without much head movement. However, don't put the mic right in front of your monitor; if you do, some notch filtering can occur. The recording 'studio' that is part of the voicebank is quite simple to use. A note to the interested: it's no small task to record 500 sentences. It took me about an hour and a half to get it done. The overall amount is roughly 3500 sentences, so I'm looking at probably 8 hours of time that I'll be putting into this effort. I will say that the first 150 sentences or so took longer because of the way I was oriented: I had to turn away from the monitor for each sentence. The next 350 went by much faster.
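
For anyone planning their own sessions, the arithmetic behind that estimate can be sketched in a few lines of Python. The figures are the ones quoted above (500 sentences in about 90 minutes, roughly 3500 sentences overall). Reading the 8 hours as covering the whole script is an assumption, and the function name is made up for illustration.

```python
# Figures quoted in the post above.
RECORDED_SENTENCES = 500
MINUTES_SPENT = 90
TOTAL_SENTENCES = 3500


def hours_at_first_session_pace(sentences):
    """Hours needed for `sentences` at the average pace of the first session."""
    return sentences * MINUTES_SPENT / RECORDED_SENTENCES / 60


# Whole script, and what is left after the first 500, at the average pace.
whole_script_hours = hours_at_first_session_pace(TOTAL_SENTENCES)
remaining_hours = hours_at_first_session_pace(TOTAL_SENTENCES - RECORDED_SENTENCES)

# Pace needed to fit the whole script into 8 hours (assumed reading).
seconds_per_sentence_for_8h = 8 * 3600 / TOTAL_SENTENCES

print(whole_script_hours)                     # 10.5
print(remaining_hours)                        # 9.0
print(round(seconds_per_sentence_for_8h, 1))  # 8.2
```

At the average pace of the first session the whole script comes to about 10.5 hours, so an 8-hour total only works out if the faster pace of the later sentences holds, at roughly 8 seconds per sentence.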
© PG Music Forums