Machine translation: everything you need to know

Machine translation (MT) is becoming increasingly popular as a way to complement human translation processes. It can be understood as a branch of computational linguistics and language engineering. It is an automated process in which software, rather than a human, translates text or speech from one language to another.

How does machine translation work?

At its simplest, machine translation is the process in which words are mechanically substituted from one language into another by translation software. It can be described as a two-step procedure: first, the meaning of the source text is decoded, and then it is re-encoded in the target language. However, this is not as simple as it sounds. Making words, phrases and whole sentences comprehensible in the target language takes more than just replacing the words. Not all words have an equivalent in the target language, and many of them have more than one meaning.
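To make the limits of plain substitution concrete, here is a minimal Python sketch. The dictionary, the function names and the example sentence are all invented for illustration and do not come from any real MT system.

```python
# Toy English-to-Spanish dictionary (hypothetical, for illustration only).
# Some words have several candidate translations; some have none.
TOY_DICTIONARY = {
    "the": ["el", "la"],
    "bank": ["banco", "orilla"],  # financial institution vs. river bank
    "is": ["es", "está"],
    "closed": ["cerrado"],
}


def substitute_words(sentence):
    """Replace each word with its first dictionary entry, ignoring context."""
    output = []
    for word in sentence.lower().split():
        candidates = TOY_DICTIONARY.get(word)
        # Words with no dictionary entry are passed through in brackets.
        output.append(candidates[0] if candidates else "[" + word + "]")
    return output


def ambiguous_words(sentence):
    """List the words that have more than one candidate translation."""
    return [
        word
        for word in sentence.lower().split()
        if len(TOY_DICTIONARY.get(word, [])) > 1
    ]


print(" ".join(substitute_words("The bank is closed today")))
print(ambiguous_words("The bank is closed today"))
```

The substitution yields "el banco es cerrado [today]", whereas a correct Spanish rendering would use "está" rather than "es". Three of the five words are ambiguous and one has no entry at all, which is exactly the problem described above.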

For this reason, the software used for machine translation is constantly being improved and updated so that translations become as accurate as possible. For example, corpus-based statistical and neural techniques are being implemented to improve output quality.

Machine translation can be improved through human intervention. A concept that is growing in popularity is “Adaptive Machine Translation”. Adaptive MT is a technology that allows MT systems to learn from corrections almost immediately. In this case, a separate retraining step is not required, because the system learns from the post-editor’s corrections as they are made. This is how it works: while you translate connected to the MT system, the system picks up your preferred ways of rendering specific terms and phrases and applies them to later segments. This improves the quality of the results.
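The feedback loop can be sketched in a few lines of Python. This is a deliberately simplified illustration: the class, the `toy_engine` function and the example phrases are hypothetical, and the "learning" here is a plain lookup of stored corrections, which is closer to a translation memory than to the model updates a real adaptive MT system would perform.

```python
class AdaptiveTranslator:
    """Toy wrapper that remembers post-editor corrections (hypothetical design)."""

    def __init__(self, base_translate):
        self.base_translate = base_translate  # any function: source text -> target text
        self.corrections = {}  # source segment -> corrected translation

    def translate(self, segment):
        # Prefer a stored human correction over the raw engine output.
        if segment in self.corrections:
            return self.corrections[segment]
        return self.base_translate(segment)

    def learn(self, segment, corrected):
        # The correction takes effect immediately, with no retraining step.
        self.corrections[segment] = corrected


def toy_engine(segment):
    """Stand-in for a real MT engine, with one hard-coded literal guess."""
    return {"press release": "liberación de prensa"}.get(segment, segment)


translator = AdaptiveTranslator(toy_engine)
print(translator.translate("press release"))   # raw engine output
translator.learn("press release", "comunicado de prensa")
print(translator.translate("press release"))   # corrected output from now on
```

The point of the sketch is the order of events: the post-editor corrects a segment once, and the next request for the same segment already reflects the correction.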

Types of machine translation

The two most common types of machine translation engine are rule-based and statistical. They differ in the way they process and analyze content, but they are often combined in the same system, known as hybrid MT.

According to Hutchins, “The dominant framework until the late 1980’s was what is now known as ‘rule-based’ machine translation (RBMT). Since then, research has been dominated by corpus-based approaches, among which the primary distinction is made between, on the one hand, statistical machine translation (SMT), based primarily on word frequency and word combinations, and on the other hand, example-based machine translation (EBMT), based on the extraction and combination of phrases (or other short parts of texts).”

1. Rule-based machine translation (RBMT)

Rule-based machine translation (RBMT) uses linguistic information about the source and target languages. It relies on dictionaries and grammars written for each language pair. RBMT uses linguistic rules to break down the content and produces more predictable output for terminology. Rule-based engines don’t require a bilingual corpus to create the translation system.
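As a rough illustration of the idea, the Python sketch below combines a hand-written lexicon with a single reordering rule for English to Spanish. The lexicon, the part-of-speech tags and the function name are assumptions made for this example; real rule-based engines use far larger dictionaries and many more rules (agreement, morphology, syntax).

```python
# Hypothetical mini-lexicon: English word -> (Spanish word, part of speech).
LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
    "fast": ("rápido", "ADJ"),
}


def translate_rule_based(sentence):
    """Look up each word, then apply one reordering rule: ADJ NOUN -> NOUN ADJ."""
    # Raises KeyError for words missing from the lexicon; a real system
    # would need a fallback strategy here.
    tagged = [LEXICON[word] for word in sentence.lower().split()]
    i = 0
    while i < len(tagged) - 1:
        if tagged[i][1] == "ADJ" and tagged[i + 1][1] == "NOUN":
            tagged[i], tagged[i + 1] = tagged[i + 1], tagged[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return " ".join(word for word, _ in tagged)


print(translate_rule_based("the red car"))
```

No bilingual corpus is involved: everything the sketch knows was written by hand, which is why the output is predictable but coverage is limited to what the rules anticipate.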

2. Statistical machine translation (SMT)

Statistical machine translation (SMT) generates translations using statistical methods and models based on bilingual text corpora, such as the Canadian Hansard corpus or the record of the European Parliament. Statistical engines don’t analyze text based on language rules. Rather, they are built by analyzing a bilingual corpus. This method requires a large volume of bilingual content.
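One classic way to learn word translations from a bilingual corpus is IBM Model 1, a word-alignment model trained with expectation-maximization. The article does not name a specific model, so the choice here is an assumption, and the three-sentence English-German corpus below is a toy stand-in for the millions of sentence pairs a real system would need.

```python
from collections import defaultdict

# Tiny hypothetical parallel corpus (English, German).
CORPUS = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a book", "ein buch"),
]


def train_ibm_model1(corpus, iterations=10):
    """Estimate t[(f, e)] = P(target word f | source word e) with EM."""
    pairs = [(src.split(), tgt.split()) for src, tgt in corpus]
    # Start with equal weight for every word pair that co-occurs in a sentence pair.
    t = {(f, e): 1.0 for src, tgt in pairs for e in src for f in tgt}
    for _ in range(iterations):
        count = defaultdict(float)  # expected number of times e translates to f
        total = defaultdict(float)  # expected number of translations of e
        # E-step: split each target word's count among the source words.
        for src, tgt in pairs:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share
        # M-step: renormalize the expected counts into probabilities.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t


def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {f: p for (f, e), p in t.items() if e == source_word}
    return max(candidates, key=candidates.get)


model = train_ibm_model1(CORPUS)
for word in ["the", "house", "book", "a"]:
    print(word, "->", best_translation(model, word))
```

No grammar or dictionary is supplied. The model works out that "house" pairs with "haus" only because "das" is better explained by "the", which appears with it in a second sentence pair.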

3. Example-based machine translation (EBMT)

Example-based machine translation (EBMT) is a method of machine translation that uses a bilingual corpus with parallel texts as its main knowledge base at run-time. It is essentially translation by analogy and can be viewed as an implementation of a case-based reasoning approach to machine learning. According to Hutchins, in its original conception EBMT seems to have been regarded primarily as a means of overcoming the deficiencies of RBMT systems, namely their weaknesses when translating between languages of greatly differing structures, such as English and Japanese.
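Translation by analogy can be sketched as "find the most similar stored example, then swap the part that differs". The Python below is a minimal illustration under that assumption. The example base, the word pairs and the function name are hypothetical, and similarity is measured with `difflib.SequenceMatcher` from the standard library purely for convenience.

```python
import difflib

# Hypothetical example base of aligned sentence pairs (English, Spanish).
EXAMPLES = [
    ("I like red apples", "Me gustan las manzanas rojas"),
    ("He bought a book", "Compró un libro"),
]

# Hypothetical word-level correspondences used to adapt a retrieved example.
WORD_PAIRS = {"book": "libro", "pen": "bolígrafo", "car": "coche"}


def translate_by_analogy(sentence):
    """Retrieve the closest example and substitute the words that differ."""
    words = sentence.split()
    # Pick the stored example whose source side is most similar to the input.
    source, target = max(
        EXAMPLES,
        key=lambda ex: difflib.SequenceMatcher(None, words, ex[0].split()).ratio(),
    )
    example_words = source.split()
    if len(words) != len(example_words):
        return None  # this toy version only handles word-for-word differences
    for new, old in zip(words, example_words):
        if new != old:
            if old not in WORD_PAIRS or new not in WORD_PAIRS:
                return None  # no known correspondence for the differing word
            target = target.replace(WORD_PAIRS[old], WORD_PAIRS[new])
    return target


print(translate_by_analogy("He bought a pen"))
```

The input "He bought a pen" is not in the example base, but it differs from a stored example by one word, so the stored translation is reused with "libro" replaced by "bolígrafo". When no close example exists, the sketch returns `None` instead of guessing.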

4. Hybrid machine translation (HMT)

Hybrid machine translation (HMT) uses multiple machine translation approaches within a single system, typically combining rules and statistics. There are two main approaches: rule-based output post-processed by statistics, or statistics guided by rules. According to H. W. Xuan et al. (2012), “Hybrid machine translation (HMT) integrates the core of existing methods, including rule-based MT, statistical-based MT, and example-based MT, which makes up the deficiencies of individual MT method.”
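The "rules first, statistics second" variant can be illustrated with a short Python sketch. Here a hand-written lexicon proposes every candidate translation, and target-language bigram counts choose among them. The lexicon, the counts and the function names are all invented for this example and do not describe any particular hybrid system.

```python
from itertools import product

# Rule-based component (hypothetical): each source word maps to candidate translations.
LEXICON = {
    "the": ["el", "la"],
    "bank": ["banco", "orilla"],
    "closes": ["cierra"],
}

# Statistical component (hypothetical): how often word pairs occur in target-language text.
BIGRAM_COUNTS = {
    ("el", "banco"): 120,
    ("la", "orilla"): 45,
    ("banco", "cierra"): 30,
}


def rule_based_candidates(sentence):
    """Generate every combination of candidate words allowed by the lexicon."""
    options = [LEXICON[word] for word in sentence.lower().split()]
    return [list(choice) for choice in product(*options)]


def fluency_score(words):
    """Sum the counts of adjacent word pairs; unseen pairs contribute zero."""
    return sum(BIGRAM_COUNTS.get(pair, 0) for pair in zip(words, words[1:]))


def translate_hybrid(sentence):
    """Let the rules propose candidates and the statistics pick the best one."""
    return " ".join(max(rule_based_candidates(sentence), key=fluency_score))


print(translate_hybrid("the bank closes"))
```

The rules alone cannot decide between "banco" and "orilla" or between "el" and "la". The counts settle both choices at once, because only "el banco cierra" consists entirely of word pairs that were seen often.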

Why does machine translation matter?

Between 2016 and 2017, more data was reportedly created than in all of prior human history. Since 2013, companies like Google and Microsoft have been testing and developing the use of neural networks for translation. Neural networks are statistical learning models that were used earlier in speech and image recognition technology. They enable MT engines to train themselves through a process loosely inspired by the way the brain works: the network makes predictions on example translations, measures its errors, and adjusts itself to reduce them. When the network has many layers, this is called deep learning.

So, in conclusion, machine translation is growing, and big companies are counting on perfecting it. So, do we need machine translation? Coming from translators, a yes may sound a bit odd. Human translation can’t and mustn’t be replaced, because we are humans and we use language to connect with each other in ways machines cannot understand. For example, machines can’t interpret emotions. Even so, MT can be particularly useful to support human translation and to speed up processes.

References

https://en.wikipedia.org/wiki/Machine_translation

https://d1wqtxts1xzle7.cloudfront.net/5673640/ebmt2_proceedings.pdf?response-content-disposition=inline%3B+filename%3DMonolingual_Corpus_based_MT_using_chunks.pdf&Expires=1598488532&Signature=QgZ8R9nlZx6T2VGZbMDsgKuktcPHoQ5DOHnzG0H-QbzBhQaqqtbvn~YWvfAwkpSbUs2TQgHmVmUcMLl-2o0bEj1ajWxD-E~bXkv0RX4-vYhbg-~CKujHVZXqmhEXl5~MocAafHybHL9CU4zeotV9J94AjZjB~IR3Ah5pYiPF7L~CA~iW3KbGHz~WW3ZGi~DUHYkV3QhhDrNpfUuuLZvadfZoyk4O6ZBq4y8Pn6DePtZs7qbH8QF10DyXpzAka4iVrF8eD24GJGtB7ffxKm2Vdbeq-aDu~OnTRL1GZUMLrN0bxYzTMVRGX~YC-iJS-C3Q1OfPlT5bitl30mZwRiyVMA__&Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA#page=71

https://en.wikipedia.org/wiki/Example-based_machine_translation

https://reader.elsevier.com/reader/sd/pii/S1877705812004420?token=B9820AB956C2ADF7BD3CFCA5B5C2CA21FCAF9398E98BFA32164354CF81F3917EC12B6E58CE340442E6402003D3A7E112
