The history of machine translation is a fairly turbulent story of boom and bust, broken promises and shattered dreams (and some spectacular successes). With the advent of Statistical Machine Translation (SMT), history has repeated itself with uncanny precision. I would say we are now between Phases 2 and 3 of the famous Gartner Hype Cycle (which, annoyingly, is not a cycle at all).
For the last 10 years or so, researchers have been telling us that SMT represented a revolution in machine translation; that there would be no more linguistics, no more rules, just data. “Give me enough data,” went their motto, “and we can work miracles!” Well, Google has enough data, but it still has its limitations. I tried some things out:
- Ich will diesen Satz übersetzen
Google says: I want to translate this sentence
Perfect! If only we could always write like this…
- Ich will nur dass diser Satz richtig übersetzt wird
Google says: I just want that images this sentence is translated correctly.
Should be: I just want this sentence to be translated correctly
This is not so great. It just doesn’t make sense – where did “images” come from? So what went wrong?
Well, I made two simple mistakes. There should be a comma after “nur”, and “diser” should be “dieser”. Let’s fix those issues and try again:
- Ich will nur, dass dieser Satz richtig übersetzt wird.
Google says: I only want that this sentence is translated correctly.
Much better, but sounds a bit funny to my (British) English ears. It’s not wrong, just a bit stilted.
Now let’s try something a bit more difficult (although by no means unusual):
- Ich möchte bitte den Satz übersetzen lassen
Google says: I would like to translate the sentence
Now this translation sounds good (if you don’t know German), but it’s actually *wrong*. It should be “I would like to have the sentence translated”. A subtle, but quite possibly technically significant, difference.
Now, I am not bashing Google Translate here: SMT is a great technology, but it’s not magic. Errors in the input to these systems will always lead to unreliable results – yes, you still have to care about the quality of your source content.
By the way, you also still need to care about branding, compliance, and liability in your source content – these issues won’t look after themselves by magic either.

@Andrew Bredenkamp: For the last 10 years or so, researchers have been telling us that SMT represented a revolution in machine translation; that there would be no more linguistics, no more rules, just data. “Give me enough data,” went their motto, “and we can work miracles!” Well, Google has enough data, but it still has its limitations.
Data alone is not sufficient. You need metadata about the context.
Data alone is not sufficient. GT is a purely statistical approach because F. Och is allergic to anything but statistics. You have done a short test on a general engine that has been designed for general understanding (Mission: “To organize the information of the world”). SMT is the core of an automated translation system, but a lot of the quality depends on the systems built around it: modules, pre- and post-processing. Controlling the input actually has little impact on the outcome, unless you are developing very customized systems and thus want the system to “think” and “expect”. It helps, but it is not a goal in itself. The system should be mature enough to deal with some uncontrolled or unexpected areas and still perform.
I see from your tests that GT gists well, and it is free.
One of the biggest mistakes a company can make in its marketing is not being properly familiar with the culture of the country it wants to expand into. For instance, something seen as cool and even stylish in the company’s home country might be perceived as rude or vulgar in another. It can be difficult to be sure, especially if you aren’t a native or haven’t spent large amounts of time there. That is why language translation services are crucial to this process.