Customized Machine Translation Mis-Use as a Cause of Shadow (Unpaid) Translation Work

In “Emerging Technologies As A Cause of Shadow Work”, I reviewed how emerging technologies tend to increase the amount of shadow (unpaid) work. In this post, I review in more detail one of the main causes of shadow work in the Language Services industry.

Over the last five years, enterprises and large language service providers have started to develop custom Machine Translation (MT) engines in an effort to improve translation productivity. An enterprise may find, for example, 50 million words of translated product documentation in its translation memories and use them to build a custom MT engine. When tested on product documentation, for some language pairs, the custom engine's outputs are better than those of generic machine translation engines (labeled as MT 2 and MT 3 in the lead figure). In production tests, the custom engine is also shown to increase translation productivity, so it becomes part of the translation production process. Professional translators are typically paid less per word to post-edit the outputs of this engine, but since they are able to translate more text in the same amount of time, their overall translation income remains the same. The amount of shadow work they typically do in this instance is small.

Encouraged by their initial success, some enterprises and language service providers become overenthusiastic: having “proven” that their custom MT engine works well on product documentation, they go on to deploy it in translation workflows that involve other content types, such as customer support and marketing documents. The problem is that custom MT engines are pretty dumb: although they may produce good outputs on the narrow domains they have been trained on, they perform abysmally on texts coming from other domains and content types.
In fact, in many instances, outside the domains and content types they have been trained on, they perform worse than generic, online Machine Translation engines that are trained on billions of words of human-translated data. This state of affairs, which is depicted graphically on the right side of the lead figure, creates large amounts of shadow (unpaid) work: translators are still paid the discounted post-editing rate, but they have to spend much more time fixing poor outputs.

Professional translators should be aware of these shortcomings, assess the degree to which the post-editing work they are asked to do is compensated fairly, and push back on language service buyers that mis-use machine translation technology. Post-editing pricing agreed upon in the context of one domain/content type should never be used in the context of a different domain/content type.
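
To make the compensation argument concrete, here is a minimal Python sketch of the arithmetic. Every number in it (rates, speeds, job size) is hypothetical and chosen only to keep the calculation easy to follow, and the function names are my own. It is not data from any real engine or contract.

```python
def hourly_income_cents(rate_cents_per_word: int, words_per_hour: int) -> int:
    """Income earned per hour of work, in cents."""
    return rate_cents_per_word * words_per_hour


def unpaid_hours(words: int, assumed_words_per_hour: int, actual_words_per_hour: int) -> float:
    """Extra hours spent on a job beyond what the agreed pricing assumed."""
    return words / actual_words_per_hour - words / assumed_words_per_hour


# Hypothetical figures, for illustration only.
FULL_RATE = 10             # cents per word, translating from scratch
POST_EDIT_RATE = 6         # cents per word, discounted post-editing rate
SCRATCH_SPEED = 300        # words per hour without MT
IN_DOMAIN_SPEED = 500      # words per hour post-editing in-domain output
OUT_OF_DOMAIN_SPEED = 250  # words per hour post-editing out-of-domain output

# In-domain: the lower rate is offset by the higher throughput.
print(hourly_income_cents(FULL_RATE, SCRATCH_SPEED))
print(hourly_income_cents(POST_EDIT_RATE, IN_DOMAIN_SPEED))

# Out-of-domain: same discounted rate, but throughput collapses.
print(hourly_income_cents(POST_EDIT_RATE, OUT_OF_DOMAIN_SPEED))

# Shadow work on a 10,000-word job priced as if it were in-domain.
print(unpaid_hours(10_000, IN_DOMAIN_SPEED, OUT_OF_DOMAIN_SPEED))
```

Under these assumed numbers, hourly income is unchanged in the in-domain case and halves in the out-of-domain case. The gap between the hours the pricing assumed and the hours actually worked is the shadow work.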