How does DeepL work?

Key Takeaways

  • DeepL uses custom neural network architectures for neural machine translation, producing more natural, context-aware output than typical Transformer-based systems.
  • It trains on targeted, high-quality multilingual data collected by specialized crawlers rather than undifferentiated raw web crawl text.
  • Its training methods go beyond basic supervised learning, using additional machine learning techniques to improve accuracy and fluency.
  • DeepL runs billion-parameter neural networks but uses parameters efficiently to balance translation quality with speed and compute demands.
  • Smaller, faster DeepL models, including free versions, still deliver exceptional translation quality for everyday professional use.
  • DeepL's neural translation technology powers a growing suite of Language AI tools, including Translator, Write, and Voice, built for enterprise-scale multilingual workflows.
  • DeepL continues hiring experts in mathematics and neural networks to advance its translation and broader Language AI capabilities.

People often ask us why DeepL Translator works better than competing systems from major tech companies. There are several reasons.

Like most modern machine translation systems, DeepL Translator uses artificial neural networks to translate text. We train these networks on many millions of translated texts. However, our researchers have been able to improve our overall neural network methodology in four key areas.

Explore DeepL AI Labs to learn about our latest breakthroughs in Language AI.

1. Custom neural network architecture for better translation quality

Most publicly available machine translation systems are direct modifications of the Transformer architecture. Of course, DeepL’s neural networks also contain parts of this architecture, such as attention mechanisms.
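
As a rough illustration of what an attention mechanism computes, here is a minimal sketch of scaled dot-product attention as described in the public Transformer literature. It is not DeepL's architecture. The function names and the toy vectors are assumptions made up for this example.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Each key is scored against the query, the scores become weights,
    and the output is the weighted average of the value vectors.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Toy data (hypothetical): three "source words", each with a key and a value.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 0.0]

weights, output = attention(query, keys, values)
print(weights, output)
```

In this toy case the query lines up with the first and third keys, so those two positions receive more weight than the second. In a translation network, this is how the model decides which parts of the input matter most for the word it is currently producing.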

That said, notable differences exist in the networks’ topology. These differences lead to a significant overall improvement in translation quality over the public research state of the art. 

These differences show up clearly in our internal experiments, in which we train our architectures and top Transformer models on the same data and compare their performance.

These architectural advances give DeepL translations a more natural, context-aware quality than typical Transformer-based systems.

2. Targeted, high-quality training data instead of raw web crawl

Most of our direct competitors are major tech companies that have spent many years developing web crawlers. As a result, they have a distinct advantage in the amount of training data available to them.

By contrast, we emphasize the targeted acquisition of special training data that helps our network achieve higher translation quality. That’s why we’ve developed, among other advances, special crawlers that automatically find translations on the internet and assess their quality.
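
To make the idea of assessing translation quality concrete, here is a deliberately simple sketch of a filter for candidate sentence pairs. DeepL has not published how its crawlers score translations, so everything here is an assumption: the function name `pair_quality`, the word-count ratio heuristic, and the thresholds are all hypothetical and far cruder than a production system.

```python
def pair_quality(source, target, max_length_ratio=2.0):
    """Return a rough score in [0, 1] for a candidate translation pair.

    Hypothetical heuristic: reject empty sides, untranslated copies,
    and pairs whose word counts differ too much; otherwise score the
    pair by how close the two lengths are.
    """
    source, target = source.strip(), target.strip()
    if not source or not target:
        return 0.0
    if source.lower() == target.lower():
        return 0.0  # identical text is a copy, not a translation
    src_len, tgt_len = len(source.split()), len(target.split())
    ratio = max(src_len, tgt_len) / min(src_len, tgt_len)
    if ratio > max_length_ratio:
        return 0.0  # lengths too different to be a plausible translation
    return 1.0 / ratio


# Made-up candidate pairs of the kind a crawler might find.
candidates = [
    ("The cat sleeps on the sofa.", "Die Katze schläft auf dem Sofa."),
    ("Click here", "Click here"),
    ("Terms and conditions apply to all orders placed online.", "AGB"),
]

kept = [pair for pair in candidates if pair_quality(*pair) >= 0.5]
print(kept)
```

Only the first pair survives this filter. The second is an untranslated copy, and the third is too lopsided in length. A real pipeline would need far stronger signals, but the principle of scoring and discarding pairs is the same.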

This focus on high-quality multilingual training data strengthens our models’ context understanding and reduces common machine translation errors.

3. Advanced training methodology beyond basic supervised learning

In public research, most approaches train neural networks for translation using the “supervised learning” method.

This means the training process shows the network many examples over and over again. The network’s own translations are repeatedly compared with the reference translations in the training data, and wherever they differ, the network’s weights are adjusted accordingly.
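
The supervised learning loop described above can be sketched on a toy problem. This example fits a two-parameter model with plain gradient descent, which is nothing like a translation network in scale, but it follows the same cycle: predict, compare with the reference, adjust the weights. The data, learning rate, and function name are assumptions chosen for illustration.

```python
def train(examples, epochs=500, learning_rate=0.05):
    """Fit prediction = weight * x + bias by repeated correction."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):              # show the examples over and over
        for x, target in examples:
            prediction = weight * x + bias
            error = prediction - target  # discrepancy with the reference
            # Nudge each parameter against the gradient of the squared error.
            weight -= learning_rate * error * x
            bias -= learning_rate * error
    return weight, bias


# Toy training data generated from target = 2 * x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

weight, bias = train(examples)
print(round(weight, 3), round(bias, 3))
```

After training, the two parameters end up close to the values that generated the data. A translation network does the same thing with vastly more parameters and with sentences instead of numbers.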

When training DeepL’s neural networks, we also use techniques from other areas of machine learning. These bring further notable improvements.

These methods help our models learn more robust language patterns and deliver more accurate, natural translations in real-world use.

4. Network size and efficiency with billion-parameter models

Like our largest competitors, DeepL researchers train translation networks with billions of parameters. These networks are so large that training must run in a distributed fashion on massive dedicated compute clusters.

In our research, we prioritize using the network’s parameters as efficiently as possible. This is how we consistently achieve similar translation quality even with smaller, faster networks, and why we can offer exceptional translation quality to users of our free service.

These advances in network size and efficiency let DeepL deliver fast, accurate translations to enterprise and free users alike.

Help advance DeepL’s neural translation technology

We’re always on the lookout for skilled mathematicians and computer scientists who’d like to drive development and improve DeepL products. You can help break down global language barriers every day and see how DeepL works firsthand. 

If you also have experience with mathematics and neural network training and find it fulfilling to work on products people worldwide use for free, explore DeepL careers.

Put DeepL's Language AI to work across your business 

DeepL's neural translation technology is the foundation of a Language AI suite built for global enterprises.

Translator breaks down language barriers across documents, websites, and business communications with the nuance global enterprises demand. 

Write refines business writing in real time, helping teams communicate across languages with clarity. Voice delivers real-time translated captions to meetings and conversations, so teams hear and understand every participant.

Contact Sales to see how DeepL's Language AI suite can power your multilingual workflows.
