From Uzbek To Klingon, The Machine Cracks The Code
from the statistical-machine-translation dept
Move over Babelfish, there's a new translation technology in town. The NY Times has discovered that in just the past few years there have been some fairly impressive advancements in statistical machine translation. Traditional machine translation systems involve a bilingual programmer who can help map the languages, but with statistical machine translation, you just feed the system identical texts from multiple languages and let the machine figure it out. It sounds like, for now, the technology works in some cases, and is probably most useful in developing fast translation systems that might miss more nuanced language issues. Some of those who believe in the traditional methods scoff at the idea that the statistical method will ever be useful for anything more than very basic translations. However, with the rate of improvement over the past few years, it wouldn't be surprising to see statistical machine translation systems improve even more in the near future.
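For the curious, here is a rough idea of what "feed the system identical texts and let the machine figure it out" can look like in practice. This is a minimal sketch of IBM Model 1, a classic textbook word-alignment method, and not necessarily what the systems in the NY Times piece use. The three-sentence German/English corpus and the names `train_model1` and `best_translation` are made up for illustration.

```python
from collections import defaultdict

# Toy parallel corpus (hypothetical): the same sentences in German and English.
PARALLEL_CORPUS = [
    ("das Haus", "the house"),
    ("das Buch", "the book"),
    ("ein Buch", "a book"),
]


def train_model1(pairs, iterations=20):
    """Estimate word translation probabilities t[(e, f)] = P(e | f) with EM."""
    corpus = [(src.split(), tgt.split()) for src, tgt in pairs]
    # Only word pairs that co-occur in some sentence pair can be translations.
    cooc = {(e, f) for src, tgt in corpus for f in src for e in tgt}
    tgt_vocab = {e for e, _ in cooc}
    # Start with no knowledge at all: every candidate is equally likely.
    t = {pair: 1.0 / len(tgt_vocab) for pair in cooc}

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of e aligned to f
        total = defaultdict(float)  # expected count of f aligned to anything
        for src, tgt in corpus:
            for e in tgt:
                # E-step: split one unit of "credit" for e among the source
                # words in the same sentence, in proportion to current t.
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    share = t[(e, f)] / norm
                    count[(e, f)] += share
                    total[f] += share
        # M-step: re-estimate the probabilities from the expected counts.
        t = {(e, f): c / total[f] for (e, f), c in count.items()}
    return t


def best_translation(t, source_word):
    """Return the target word with the highest P(target | source_word)."""
    candidates = {e: p for (e, f), p in t.items() if f == source_word}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    table = train_model1(PARALLEL_CORPUS)
    for word in ("das", "Haus", "Buch", "ein"):
        print(word, "->", best_translation(table, word))
```

Nobody tells the program that "Buch" means "book". It works that out because the two words keep showing up together in paired sentences. It also has no notion of grammar, word order, or nuance, which is roughly the complaint of the traditionalists.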
Reader Comments
Oh please
"If we can learn how to translate even Klingon into English, then most human languages are easy by comparison," he said.
Klingon was invented by English-speaking sci-fi fans, with their narrow imagination. It does not represent another culture.
The fact remains that languages have concepts that are not understood in other languages. As with the cliche about Inuits having 20 words for snow, the Japanese have about 20 ways to refer to "self", as well as nuanced honorifics that simply do not translate into English. American Sign Language, with its parallel-speaking forms, has no equivalent in written or spoken English. Even in its simple form, a literal translation of an American Sign Language sentence would go something like "Past night movie me see wow me."
Since ASL is so conceptually different, a new writing system called Signwriting is also evolving.
http://www.at-links.gc.ca/guide/l3i-E.asp
Re: Oh please
Re: Oh please
Nuances Don't Matter Much
Nuances obviously matter when translating literary fiction or popular culture artifacts. I doubt this translation engine would handle most manga well. But it should work at least as well as BabelFish and perhaps, eventually, much better.
Re: Nuances Don't Matter Much
I've done some technical Japanese-to-English translation work before. The problem is the level of ambiguity allowed in the Japanese language, which can make an English translation of a scientific paper sound vague, contradictory, or stupid. Often, one has to talk to the author directly to nail down an exact translation, though he may not be happy with the way it's expressed.
>I doubt that in discussing object oriented programming, the 20 words for snow used by the inuits will be encountered.
Depends -- if someone writes an Inuit manual and decides to use the 20 words for "snow" to illustrate polymorphism, then an English speaker is in trouble.
>But it should work at least as well as BabelFish and perhaps, eventually, much better.
BabelFish won't be too hard to beat.
20 words for snow
Re: 20 words for snow
That sums up a street hustler's anti-intellectual attitude, but in the world of serious translation, such attitudes do not fly. Translations have important legal, scientific, and medical consequences. The wrong instructions given to a patient can kill him. The wrong wording for treaties can lead to wars. The wrong instructions for operating a crane can kill lots of people, and this has happened before.
>words that don't translate directly can be ported over into a reasonably flexible language; programs that can aid in translation may hasten the merging of today's languages into a universal pan-human language.
People have tried this with Esperanto or other Klingon-league languages, but they have no real world use.
Re: Nuances Don't Matter Much
Between the specialized kanji and the fscking loanwords, it's pretty hopeless...
Hell, even the movie "roadshows" are the last on the planet to be released (of course that's probably just the Japanese movie industry cashing in on their monopoly)... some of the alternative titles are funny though...
Re: 20 words for snow
except for insulting newbies on-line...
...and letting the police try to find a translator who can read you your rights...
Re: Nuances Don't Matter Much
Even technical terms have hundreds of meanings ("skhema" in Russian has about 3 dozen) all of which depend on subtle clues in the text. Even skilled humans, who are pre-wired for ambiguity, have difficulty sorting them out.
Re: Oh please
Klingon was developed by a professional linguist, Marc Okrand, who did his graduate work in linguistics at Berkeley (his undergrad degree is from Georgetown), on contract to Paramount Pictures for Star Trek III. He is certainly not a "sci-fi dork," but he is most certainly a very accomplished professional linguist. Although the language is technically "invented," it has specific and fairly complex rules of grammar, syntax and morphology. There is even an invented culture to support specific terminology, complex multiple meanings, etc. so the semantics are well developed.