We can all agree that assistive technology is great. It gets even better when the solution involves not just software or hardware innovation, but an exchange between those needing assistance and people capable of providing it. Take, for example, the app that lets you “lend” your eyes to blind individuals over video chat. In a similar vein, VocaliD has created technology that gives speech-impaired individuals a synthesized voice that sounds similar to their own.
VocaliD founder and speech scientist Rupal Patel explains in her TED talk that a voice is as unique to an individual as a fingerprint. Unfortunately, people who lose the faculty of speech to injury or sickness are forced to choose from a limited range of synthesized voices. Offering only a handful of voices is convenient and scalable, but it isn’t ideal, primarily because a person’s voice is closely linked with their personality. Giving the same voice to a large group of people who vary in gender, age, and body type takes away from their sense of individuality.
To address this, Patel, along with speech synthesis expert Timothy Bunnell, came up with a way for each individual to receive a bespoke voice that closely mimics their own. Using VocaliD as a platform, the team crowdsources voices from people around the world. To obtain samples, the software leads voice donors through recording sessions in which they’re asked to read selected passages aloud.
The process doesn’t require donors to enunciate every sound or word that a recipient would need in daily speech. Instead, the software uses a couple of hours’ worth of recordings to piece together a profile for each voice. This is similar to how Siri can say practically anything in English without the person behind the voice having recorded every phrase or sentence. Likewise, VocaliD takes a limited set of recorded sentences and extrapolates what all other sentences would sound like based on what it learns from the available dataset.
All the voices collected from around the world are collated into what’s called a Voicebank. Currently, more than 18,000 people from 110 countries have contributed to the Voicebank, so there’s a large variety of speech types from which to choose. Each time the need arises, a donor whose voice is similar to that of the recipient is handpicked. The actual delivery of the voice is carried out by devices made by manufacturers such as Tobii Dynavox, with whom VocaliD has partnered.
But wait a second. How does VocaliD figure out what a recipient’s voice would sound like if they are speech-impaired? To understand that, we must first take a little lesson in speech theory. Speech is produced in two stages. In the ‘source’ stage, the actual sound is generated. In the ‘filter’ stage, that sound is shaped by the resonant properties of the vocal tract.
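For the curious, the two-stage idea can be sketched in a few lines of Python. This is only a toy illustration of the general source-filter model, not VocaliD’s actual method: the function names, the 16 kHz sample rate, the 120 Hz pitch, and the rough vowel resonance values are all assumptions chosen for the example.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)


def glottal_source(f0_hz, duration_s, sample_rate=SAMPLE_RATE):
    """'Source' stage: an impulse train at pitch f0_hz.

    A crude stand-in for the pulses produced by the vocal folds.
    """
    n = round(duration_s * sample_rate)
    period = round(sample_rate / f0_hz)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def resonator(signal, center_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """'Filter' stage: a two-pole resonator imitating one vocal-tract resonance."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1 so the filter is stable
    theta = 2 * math.pi * center_hz / sample_rate        # pole angle
    a1 = 2 * r * math.cos(theta)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize(f0_hz, formants, duration_s=0.05):
    """Pass the source through one resonator per (center_hz, bandwidth_hz) pair."""
    signal = glottal_source(f0_hz, duration_s)
    for center_hz, bandwidth_hz in formants:
        signal = resonator(signal, center_hz, bandwidth_hz)
    return signal


# Same source (pitch), two different filters (rough, assumed vowel resonances).
vowel_a = synthesize(120, [(700, 100), (1200, 100)])
vowel_i = synthesize(120, [(300, 100), (2300, 100)])
```

Both signals start from the same source and differ only in the filter stage, which is why the two stages can be treated separately.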
Individuals with speech-related disabilities retain the ability to modulate the ‘source’ stage sounds. Using these residual utterances, VocaliD can work out what a person’s actual voice would sound like. Based on that estimate, a donor is picked from the Voicebank.
VocaliD’s effort to turn crowdsourced voices into custom voices is currently underway. You can sign up to donate yours on the website.
Brian Mitchell says
I am sure that kids who are born without the ability to speak would go ape shit crazy for something like this. Is there a way that you could make them sound like Spongebob or another favorite cartoon?
Donald Willoughby says
Just like giving someone the gift of hearing, here you can give someone a voice when they thought they would never have one!
Irma Janus says
I am impressed with how the program picks “your” voice. There is so much more that the AI can do for us, I just wish people would stop freaking out about it.
Pedro Molina says
A problem that impacts millions and now a solution. I think this is the kind of thing that goes well with AI. Maybe the voice can learn how to sound as the speech is happening and over time. Donating to the company is an easy choice.
Mary Hall says
I also like the idea of donating your own voice. It is like donating blood for a good cause, right?
Thomas Helsley says
Yes, it is for a good cause. I would be happy to donate my voice as well.