What is Neural Machine Translation & How does it work?

Published on 18 Apr 11:10 by Sam Yip
Tags: translation   AI   machine translation   tools  


Neural machine translation (NMT) is software that uses neural networks to translate text from one language to another. Google Translate and Baidu Translate are well-known examples of NMT offered to the public over the Internet.

NMT matters because recent advances in the technology have led an increasing number of multinational institutions to adopt NMT engines for internal and external communications.

As the chart above shows, neural machine translation is currently the state-of-the-art approach to machine translation and offers the highest-quality output. Google Translate began using the technology to power its service in 2016, after relying on statistical methods for many years.

How Does It Work?

Technically, NMT encompasses any machine translation in which an artificial neural network predicts an output sequence of numbers from an input sequence of numbers. In translation, each word of the source sentence (e.g. English) is encoded as a number; the network maps this sequence of numbers to a new sequence of numbers, which is then decoded into the target sentence (e.g. Chinese).

To give you a simplified example of an English to Chinese machine translation:

"I am a dog" is encoded into the numbers 251, 3245, 953, 2.

The numbers 251, 3245, 953, 2 are fed into a neural translation model, which outputs 2241, 9242, 98, 6342.

The output 2241, 9242, 98, 6342 is then decoded into the Chinese translation "我是只狗".

(Each number in the input and output corresponds to a word in the English or Chinese vocabulary, and the same mappings are always used for encoding and decoding.)
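The pipeline above can be sketched in a few lines of Python. The vocabularies and the "model" here are invented for illustration; in a real NMT system the mapping in the middle is learned by a neural network rather than hard-coded.

```python
# Toy illustration of the encode -> translate -> decode pipeline.
# Vocabulary IDs match the example above but are otherwise arbitrary.

en_vocab = {"I": 251, "am": 3245, "a": 953, "dog": 2}
zh_vocab = {2241: "我", 9242: "是", 98: "只", 6342: "狗"}

def encode(sentence):
    """Map each English word to its vocabulary ID."""
    return [en_vocab[word] for word in sentence.split()]

def toy_model(ids):
    """Stand-in for the neural network: a hard-coded lookup."""
    mapping = {251: 2241, 3245: 9242, 953: 98, 2: 6342}
    return [mapping[i] for i in ids]

def decode(ids):
    """Map each output ID back to a Chinese word."""
    return "".join(zh_vocab[i] for i in ids)

ids = encode("I am a dog")   # [251, 3245, 953, 2]
out = toy_model(ids)         # [2241, 9242, 98, 6342]
print(decode(out))           # 我是只狗
```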

The example above raises a further question: how does the translation model itself work? The short answer is that it is a complex mathematical function, represented as a neural network. As described earlier, this function takes a sequence of numbers as input and produces a sequence of numbers as output. The parameters of the network are created and refined by training it on millions of sentence pairs (e.g. English and Chinese sentence-pair translations). Each sentence pair adjusts the network slightly as it runs through, using an algorithm called back-propagation. The result is a best-fit model that most accurately maps input numbers to output numbers across the millions of sentence pairs it was given.
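The training loop described above can be sketched with a toy network. This is a deliberately simplified regression, not a real translation model: the data, layer sizes, and learning rate are invented, and the point is only to show the forward pass, the back-propagated gradients, and the small weight update made for each batch of examples.

```python
import numpy as np

# Minimal sketch of back-propagation: a one-hidden-layer network
# trained by gradient descent to map input vectors to target vectors.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))            # 200 "sentences" of 4 numbers
true_W = rng.normal(size=(4, 4))
Y = X @ true_W                           # targets the network must learn

W1 = rng.normal(scale=0.1, size=(4, 8))  # parameters ("weights")
W2 = rng.normal(scale=0.1, size=(8, 4))

lr = 0.1
for step in range(2000):
    h = np.tanh(X @ W1)                  # forward pass: hidden layer
    pred = h @ W2                        # forward pass: output layer
    err = pred - Y
    loss = np.mean(err ** 2)
    # Backward pass: propagate the error back to each weight matrix.
    grad_W2 = h.T @ err / len(X)
    grad_h = err @ W2.T * (1 - h ** 2)   # tanh derivative
    grad_W1 = X.T @ grad_h / len(X)
    W1 -= lr * grad_W1                   # small update per pass
    W2 -= lr * grad_W2

print(f"final mean-squared error: {loss:.4f}")
```

After many passes the error shrinks, which is exactly the "each sentence pair modifies the network slightly" behaviour described above, just on toy data.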

Why Neural Networks?

Simply put, neural networks permit complexity. A neural network can have a very large number of parameters (weights and biases between nodes), giving it the flexibility to fit highly complex data and train complex models. This capacity lets the model absorb and generalize from the enormous volume of examples it is trained on, such as millions of language pairs.

One way to understand neural networks is to think of the input as a signal carrying "information", only some of which is relevant to the task at hand (e.g. predicting the output). Think of the output as a signal carrying its own information. The network repeatedly refines and compresses the input signal to match the desired output signal: each hidden layer strips out unimportant parts of the input while keeping and transforming the parts relevant to the output. By the final hidden layer, a fully-connected network has transformed the input into a representation that the output layer can separate.
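The layer-by-layer transformation can be made concrete with a small forward pass. The layer sizes here are arbitrary and the weights are random rather than trained, so this shows only the structure of the signal flowing through the network, not learning.

```python
import numpy as np

# A fully-connected forward pass: each layer transforms the signal
# into a new representation with its own dimensionality.

rng = np.random.default_rng(1)
layer_sizes = [6, 16, 8, 4]   # input -> hidden -> hidden -> output
weights = [rng.normal(size=(m, n))
           for m, n in zip(layer_sizes, layer_sizes[1:])]

x = rng.normal(size=(1, 6))   # one input "sentence" of 6 numbers
h = x
for i, W in enumerate(weights):
    h = np.tanh(h @ W)        # layer i transforms the representation
    print(f"after layer {i + 1}: shape {h.shape}")
```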

Use Cases of NMT

NMT technology can be applied to any language pair, including languages that are new or understood by few. Models can also be fine-tuned to particular styles and varieties of a language (e.g. casual, formal, US English, UK English, scientific, medical, financial, etc.). NMT quality depends primarily on the training data, since the network learns to mimic the data it was trained on. Many high-accuracy, industry-specific, custom-developed machine translation (MT) models still combine neural and statistical methods today to squeeze out the best performance (including the ones we develop at TranslateFX for our clients).

Closing Remarks

One important disadvantage of NMT worth mentioning is consistency. Because neural networks learn from large volumes of data, they are not easy to tame and control. A model may translate a particular term such as 'restaurant' as '菜馆' in one context and as '餐馆' in another (both correct Chinese translations). This becomes particularly troublesome when translating names of people or companies. At TranslateFX, we had to develop additional neural models to overcome such shortcomings of NMT.
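A common, simple mitigation for inconsistent terminology (not necessarily the approach TranslateFX uses) is a glossary pass over the machine output, replacing variant translations with a preferred form. The glossary entries below are invented for illustration.

```python
# Sketch of a terminology-enforcement pass over NMT output.
# Entries map variant renderings to a single preferred form.

glossary = {
    "菜馆": "餐馆",        # normalize variant translations of "restaurant"
    "谷歌公司": "Google",   # enforce a preferred rendering of a name
}

def enforce_glossary(text: str) -> str:
    """Replace variant translations with the preferred glossary form."""
    for variant, preferred in glossary.items():
        text = text.replace(variant, preferred)
    return text

print(enforce_glossary("这家菜馆很好"))  # -> 这家餐馆很好
```

A string-replacement pass like this is crude (it ignores context and word boundaries), which is part of why production systems resort to additional models rather than lookup tables alone.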

It is also important to mention that while the benefits of NMT have made it an indispensable part of translation management and workflows, NMT still requires human translators to review and post-edit the machine output. In fact, we believe humans will always play a critical role in machine translation, reviewing the accuracy and context of the machine's output. I'll explore this topic further next time, as it is worth a post of its own.

