This AI cloned my voice with just three minutes of audio

There’s a scene in Mission: Impossible 3 that you might remember. In it, our hero, Ethan Hunt (Tom Cruise), tackles the film’s villain, holding him at gunpoint and forcing him to read a bizarre series of sentences aloud.

“What I enjoy most is the pleasure of Busby’s company,” he reads reluctantly. “He nailed Miss Yancy’s chair and she called him a horrible boy. At the end of the month he threw two kittens across the room…”

Although the passage sounds random and unimportant, it quickly becomes clear that the words are not random at all – they were deliberately designed to help a software program clone his voice. Once he’s finished the passage, the software analyzes the audio and instantly gives Hunt the ability to speak and sound just like the villain – the final piece of his near-perfect disguise.

Now, if you take that scene and strip away all the espionage, guns, and dramatic tension, what remains is a pretty solid example of what I witnessed at CES today during a demo of My Own Voice, an AI-powered “voice banking” service from a French startup called Acapela Group.

The company’s raison d’être is to help people who are losing their ability to speak. This typically happens as a result of injury or illness, including conditions like ALS, Huntington’s disease, and throat cancer. Whatever the cause, the company’s My Own Voice platform allows a person to synthetically clone their voice and preserve the unique tone, timbre, and personality that make it their own – something that is usually lost in most text-to-speech software (think Stephen Hawking).

Now, to be fair, voice cloning technology isn’t necessarily new or groundbreaking at this point. Such services have been around for years, and thanks to the advent of deepfakes, there are currently dozens of other companies that can do the same thing as Acapela Group. But there are two big things that set My Own Voice apart from the rest of the pack: speed and purpose.

My Own Voice is impressively fast. Unlike other services that often require hours of reference audio to create a realistic-sounding clone, My Own Voice’s AI can produce an amazingly good synthesis after listening to just 50 short sentences, or about three minutes of recorded audio. It’s basically just like that Mission: Impossible scene: the company has developed a streamlined set of reference phrases to help its AI learn what you sound like. So instead of laboriously recording every word you can think of, all you have to do is read a handful of simple sentences aloud.

Arguably more important than the software’s speed, however, is its purpose. Again, this technology is not particularly new or novel. There have been a handful of notable startups that have developed similar voice cloning technologies – such as Canadian startup Lyrebird or London-based company Sonantic. But both startups were quickly acquired, and their voice cloning technology was eventually used for AI overdubbing in films and video editing software.

That’s not to say that those aren’t good uses of voice cloning technology. They certainly are, and they’re probably pretty profitable as well – but that’s what makes My Own Voice so cool. It’s not often that you come across such powerful technology that wasn’t built for entertainment or productivity, but was specifically designed to help the underprivileged and literally give them a voice.
