DeepL AI Labs

How do you show off real-time voice translation to potential customers? 

One route is to use actors in ads that show the idealized image of how it should work. You use prepared scripts of people “using” your solution, and pre-recorded translated versions perfectly laid over the top. At DeepL, we were never going to do it this way.

We didn’t want to tell people stories about what real-time voice translation could be like. We wanted to show them what it’s actually like to use, in real life and in real time. We wanted anyone to be able to experience DeepL Voice for themselves.

For that, you need a demo that anybody can use from any device. All they should need is a mobile phone or a laptop, and a link. You need to turn a complex, integrated product for businesses into an easy-to-use, highly accessible demo that takes just seconds to try.

We built that demo using DeepL Voice API. It didn’t just deliver an “Aha!” experience of real-time voice translation for anyone with a laptop or a mobile phone. It also proved just how easy and impactful it is to build with Voice API. And it has given us a playground for developing new, breakthrough voice features, one that is ideally suited to DeepL’s Hack Friday culture.

Building a working demo in under a week with DeepL Voice API

Building the Voice Demo was a challenge that we set for Software Engineer Santhos Ramalingam when he was just a few weeks into his time at DeepL. 

We asked Santhos to optimize for “time to Aha!” – to create a demo that lets people experience DeepL Voice for themselves, with all friction removed. They don’t need to talk to sales. They don’t need to buy a license. They don’t need to set up our integrations for Zoom Meetings or Microsoft Teams.

For a task like this you need an API – but not just any API. As Santhos explains, a real-time system brings particular challenges that go beyond standard request-and-response API calls. It has to be designed from the outset for robustness, and for the scale, reliability and availability that real-time use requires.

DeepL Voice API delivers on all of these things. It’s carefully designed. It’s robust. It’s well documented. Once Santhos had scoped out the Voice API’s capabilities and decided to build the demo on it, he was able to deliver a proof of concept in under a week.

What’s it like to build a robust, working demo on DeepL Voice API? Here’s Santhos’s story:

The simplicity is the sell: Descoping for a high-impact demo

Descoping was vital to the demo that Santhos built. Being clear about the experience he wanted to deliver let him strip away extraneous features. That meant he could showcase the core engine of voice translation that really makes a difference for people.

One characteristic that Santhos wanted to come across in the demo is the distinctive way that DeepL Voice balances accuracy and stability for real-time translation. You can see the translation engine thinking along with you: delivering a highly accurate initial translation and then making small adjustments as a sentence finishes. Refining for accuracy enhances the smoothness of the experience, rather than undermining it. You don’t just get a voice translation. You get a real-time voice translation, and a real sense of the speed and stability that helps DeepL Voice stand out.
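To make that "thinking along with you" behavior concrete, here is a minimal sketch of how a client could display a translation that is refined as a sentence finishes. It is not taken from the DeepL Voice API: the update shape (`text`, `final`), the `LiveTranslation` class and the sample updates are all hypothetical, chosen only to illustrate the idea of a stable part plus a revisable tail.

```python
from dataclasses import dataclass, field


@dataclass
class LiveTranslation:
    """Text shown to the user: finished sentences plus a revisable tail."""
    stable: list[str] = field(default_factory=list)
    tentative: str = ""

    def apply(self, update: dict) -> None:
        # Hypothetical update shape (an assumption, not the real API):
        # {"text": str, "final": bool}
        if update["final"]:
            # The sentence is complete: lock it in and clear the tail.
            self.stable.append(update["text"])
            self.tentative = ""
        else:
            # An interim result replaces the previous guess for this sentence.
            self.tentative = update["text"]

    def render(self) -> str:
        parts = self.stable + ([self.tentative] if self.tentative else [])
        return " ".join(parts)


# Simulated stream of updates (invented for illustration, not real output).
updates = [
    {"text": "Good morning", "final": False},
    {"text": "Good morning, everyone", "final": False},
    {"text": "Good morning, everyone.", "final": True},
    {"text": "Let's get", "final": False},
]

view = LiveTranslation()
for u in updates:
    view.apply(u)
    print(view.render())
```

Because only the tail is ever rewritten, text the listener has already read stays put, which is one simple way to keep refinements from feeling jumpy.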

A playground for new features

Santhos and the API team haven’t just built a demo. They’ve created a playground for our AI research, product and sales teams. We’re now using an experimental version of his demo to showcase upcoming voice translation features and share them with our customers.

Here’s Santhos’s take on one of those experiments: voice cloning features that will help our breakthrough voice-to-voice translation experience reflect the style and tone of how people speak, and carry them authentically across languages.

Try out the new DeepL Voice demo, and learn more about what you can build with the DeepL API.

 
