DeepL is exploring the “next frontier” of AI translation with DeepL Voice

German tech darling DeepL has (finally) launched a voice-to-text service. It's called DeepL Voice and it converts audio from live or video conversations into translated text.

DeepL users can now listen to people speaking a language they don't understand and have the speech automatically translated into a language they do understand – in real time. The new function currently supports English, German, Japanese, Korean, Swedish, Dutch, French, Turkish, Polish, Portuguese, Russian, Spanish and Italian.

What's exciting about the launch of DeepL Voice is that it runs on the same neural networks as the company's text-to-text offering, which it claims is the “world's best” AI translator.

As someone who just moved to a foreign country, I'm dying to try out a voice-to-text translator that might actually work. None of the ones I've tried so far work in real time – there's a lag that makes them pretty unusable – and the translation quality is pretty bad.

For personal conversations, you can start DeepL Voice on your cell phone and place it between you and the other speaker. Your conversation will then be displayed so each person can easily follow the translations on one device.

You can also integrate DeepL Voice into Microsoft Teams to hold video conferences across language barriers. The translated text appears in a sidebar as subtitles. It remains to be seen whether DeepL Voice will soon be available on platforms such as Zoom or Google Meet.

“The Next Frontier”

Although this is DeepL's first such offering, it likely won't be the last. Jarek Kutylowski, founder and CEO of DeepL, called real-time voice translation the “next frontier” for the company.

“DeepL is already a leader in written translation, but real-time voice translation is a completely different story,” said Kutylowski.

“When translating speech, you deal with incomplete input, pronunciation issues, latency, and more, which can lead to inaccurate translations and a poor user experience.

“So we have developed a solution that takes these aspects into account from the start and enables companies to reduce language barriers by allowing them to communicate in multiple languages as needed,” he said.

Quality will likely be DeepL Voice's differentiator over the countless other voice-to-text translation providers.

From a technological perspective, DeepL's success lies in the architecture of its neural networks, the input of human editors and the training data. But Kutylowski also believes the company has one key advantage over its competitors: focus.

“Focus is always an important thing,” Kutylowski previously told TNW. “Translating isn’t Google’s core business – it’s one of 100 side jobs. The same is true if you consider the LLMs and the OpenAIs of the world as our competition; translation is just one thing of what they do, and their GPU does a lot of different things. We focus on a specific area.”

In May, DeepL reached a valuation of $2 billion after securing a new investment of $300 million (€277 million). The company covers 32 languages and counts over 100,000 business users.
