For those following the progress of this project, I'd hoped that I'd have a final version of my synthetic singer program completed back in March. However, it had a bad case of the mumbles, and I headed back to the drawing board to rewrite it. It's been eating my spare time since then.
I spent far too much time trying to get synthesized plosives and fricatives working. Then I listened to earlier versions of the program, and realized how much better they sounded using samples instead. So all that work got tossed out, and I ended up writing the code from scratch... again.
Just the other day, I got the code to where it is again reading MusicXML files and generating .wav files. There's still a lot of work to do, but the end is (hopefully) in sight.
I've been using "Twinkle, Twinkle, Little Star" as my demo song, and here's the most current version, warts and all. For example, I haven't yet figured out why it can't say "world" correctly.
As a change of pace, I decided to do my own version of Daisy Bell, one of the first examples of computer generated singing. It's still missing some phonemes, so I've cheated in spots. For example, the /G/ is actually /D SH/. The lyrics were automatically converted to phonemes, but I did some replacement by hand because the allophonic replacement code isn't working yet.
I also added some compression and reverb because everything sounds better with reverb. Just to let you know, it doesn't sound quite this good out of the box... But it sounds exactly as bad.
Anyway, here's synSinger singing "Daisy", as well as the 1961 version by Max Mathews, John Kelly, and Carol Lochbaum, which I found on Perry Cook's website. To make comparison easier, they're in the same key and tempo:
I've always been curious about what software was used to generate the computer performance 55 years ago. It turns out that the data was hand-coded into the computer. That explains why I was never able to find any references to the text-to-singing program... It never existed!
As always, comments (positive or negative) are appreciated.
I'll confess - that's one thing I hadn't considered.
But that reminds me that I do need to add another feature - using another sound source instead of a glottal pulse. Remember the "Cellos" voice on the MacInTalk?
I've never heard that original recording, absolutely insane that that was 1961... I also can't believe that you coded this yourself, that's crazy!! Amazing work. Your version is sounding very good. Keep us updated on how you progress with "world"!
Wow. Amazing you coded that yourself. I can't even begin grasping how you did that. Like how do you pick the right sample for a specific part of a word, or did you sample sounds and complete words? Individual letters can sound different when used in combination with other letters...
There are a couple different ways that vocal synthesis can be approached. The method that I'm using is called "formant synthesis", and is one of the oldest techniques that's been used for computer synthesis.
In English, there are approximately 40 distinct "sounds" that make up the language, which are referred to as "phonemes".
There are different phonetic systems, but one of the simplest for American English is the "Arpabet", which uses plain text characters to represent phonemes. For example, the word "dictionary" would be written:
I use the CMU Dictionary to convert English into phonemes. If a word isn't found in the dictionary, I fall back to a public domain program called "Reciter" which guesses how to pronounce the word.
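As a rough illustration of that lookup-then-fallback idea, here is a minimal sketch in Python (not synSinger's actual Lua code). The two dictionary entries are CMU-style pronunciations with the stress digits dropped; `TINY_DICT`, `guess_pronunciation` and `to_phonemes` are made-up names, and the letter-by-letter guesser is only a toy placeholder for what a rule-based program like Reciter does.

```python
# Tiny stand-in for the CMU Pronouncing Dictionary (the real one is far larger).
# Stress digits are omitted to keep the example short.
TINY_DICT = {
    "daisy": ["D", "EY", "Z", "IY"],
    "twinkle": ["T", "W", "IH", "NG", "K", "AH", "L"],
}

# Toy letter-to-phoneme table; an assumption for illustration only.
LETTER_TO_PHONEME = {
    "a": "AE", "b": "B", "d": "D", "e": "EH", "i": "IH", "k": "K",
    "l": "L", "m": "M", "n": "N", "o": "AA", "r": "R", "s": "S",
    "t": "T", "u": "AH",
}

def guess_pronunciation(word):
    """Hypothetical fallback: map each known letter to one phoneme."""
    return [LETTER_TO_PHONEME[ch] for ch in word if ch in LETTER_TO_PHONEME]

def to_phonemes(word):
    """Look the word up in the dictionary, falling back to the guesser."""
    key = word.lower().strip(".,!?\"'")
    if key in TINY_DICT:
        return TINY_DICT[key]
    return guess_pronunciation(key)

print(to_phonemes("Daisy,"))
print(to_phonemes("bat"))
```

A real fallback uses context-sensitive spelling rules rather than a one-letter-one-sound table, which is why it "guesses" rather than knows.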
Phonemes are turned into sound by simulating the human vocal tract electronically. Before explaining that, let me give a (very simplified) explanation of how we create vocal sounds.
As air passes through the glottal folds, the folds vibrate and create sound. By controlling the tension (which in turn controls the length of the folds' opening), we can raise and lower the pitch we create. This pitch is called the fundamental frequency (F0), which we hear as the pitch of the voice.
This pitched glottal pulse (which resembles a kazoo sound) passes through our mouth. We use our tongue to create one or more resonating chambers that reinforce specific frequencies in the glottal pulse. These reinforced frequencies are called "resonances" (or "formants"), and are what distinguish one phoneme from another.
For example, (borrowing from the SoftVoice website), here are a number of vowel sounds, and the frequencies of resonance for the "average" male speaker in Hz:
In the phoneme /IY/ (as in beet), the first formant (F1) is at 270Hz, the second (F2) is at 2300Hz, and the third (F3) is at 3000Hz. Again, these formants don't alter the fundamental pitch, and remain fixed no matter what pitch you're singing.
Some phonemes are obviously more complex than that. For example, the phonemes /IY/ and /UW/ are diphthongs, and consist of two distinct targets. But I'm digressing...
To do this electronically, I generate a waveform that approximates a glottal pulse at the desired pitch, and pass it through a series of bandpass filters - one for each formant frequency - to resonate at the desired frequencies. The output is a rough approximation of the sound.
Changing the pitch of the glottal pulse changes the pitch that's being sung. Changing the resonating filters to new values changes the phoneme that's being sung.
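To make the source-and-filter idea concrete, here is a minimal Python sketch, not the synSinger code itself. It swaps in two simplifications: the glottal pulse is just an impulse train, and each formant filter is the two-pole resonator from Klatt's published synthesizer design. The /IY/ formant frequencies are the ones quoted above; the sample rate, the bandwidths and all function names are assumptions for illustration.

```python
import math

SAMPLE_RATE = 16000  # assumed sample rate in Hz

def impulse_train(f0, duration):
    """Crude stand-in for a glottal pulse: one unit impulse per pitch period."""
    period = int(round(SAMPLE_RATE / f0))
    count = int(SAMPLE_RATE * duration)
    return [1.0 if i % period == 0 else 0.0 for i in range(count)]

def resonator(signal, freq, bandwidth):
    """Two-pole resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / SAMPLE_RATE
    c = -math.exp(-2.0 * math.pi * bandwidth * t)
    b = 2.0 * math.exp(-math.pi * bandwidth * t) * math.cos(2.0 * math.pi * freq * t)
    a = 1.0 - b - c
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def sing_vowel(f0, formants, duration=0.5):
    """Run the source through one resonator per (frequency, bandwidth) pair."""
    signal = impulse_train(f0, duration)
    for freq, bandwidth in formants:
        signal = resonator(signal, freq, bandwidth)
    return signal

# /IY/ formant frequencies from the post; the bandwidths are assumed values.
IY = [(270, 60), (2300, 100), (3000, 120)]
samples = sing_vowel(110, IY)
```

Calling `sing_vowel(220, IY)` gives the same vowel an octave higher, and passing a different formant list at the same `f0` gives a different vowel at the same pitch.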
Some sounds (like the frication in the /F/ or the plosive in the /T/) are created by means different than described above. I used to synthesize them, but I now use digital samples because they give better results.
Very little of the work that I've done is my own idea. I've borrowed heavily from the published work of Dennis Klatt, who wrote one of the first text-to-speech computer programs.
If you're curious, I'd highly recommend downloading this Formant Synthesis Demo. Click and drag in the area marked F1/F2 (formants one and two) and you'll get a good idea how this works.
Thanks for your answer, David. It did bring me more understanding of the subject of vocal synthesis. I will visit the sites you mentioned. Please keep us informed about your progress with this project. I find it very interesting.
The various parameters of vibrato - depth, speed, minimum note length and delay before start - can be specified.
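Here is one way those four vibrato parameters could fit together, as a small Python sketch rather than synSinger's actual code. The function name, the default values and the choice of a plain sine wave measured in cents are all assumptions.

```python
import math

def vibrato_f0(base_hz, note_seconds, t, depth_cents=50.0, rate_hz=5.5,
               min_note_seconds=0.4, delay_seconds=0.2):
    """Return the pitch in Hz at time t (seconds from the start of the note)."""
    # Notes shorter than the minimum length get no vibrato at all,
    # and longer notes stay steady until the delay has passed.
    if note_seconds < min_note_seconds or t < delay_seconds:
        return base_hz
    # Sine-wave pitch deviation in cents, starting from zero at the delay point.
    cents = depth_cents * math.sin(2.0 * math.pi * rate_hz * (t - delay_seconds))
    return base_hz * 2.0 ** (cents / 1200.0)
```

Because the sine starts at zero when the delay ends, the pitch doesn't jump when the vibrato kicks in.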
I've also worked on clearing up a number of phonemes, including the /ERL/ in "world", although the main problem with that word is the trailing /D/.
This version of synSinger is written in Lua, and isn't particularly fast - it renders audio at about half the speed of the song. I still haven't had time to figure out how to create a stand-alone executable. I need to spend some time with the squish documentation.
There are still instances where it will "squelch" when parameters change too quickly, and some of the phonemes still need more attention. But for the most part, the output seems to be fairly acceptable, although not always intelligible. Truth be told, it's only incrementally better than prior versions.
I've also created a "female" voice for synSinger by mapping phoneme formants from average male phoneme space to average female phoneme space. It also modifies some other parameters, such as raising the pitch up an octave (so "she" doesn't sound like a chain smoker), adding more breath noise, altering the glottal pulse, and modifying the formant bandwidths based on a shorter larynx. But it still sounds a bit cheesy, because female voices aren't something formant synthesis does that well: