26.07.2022

After automatic post-editing (APE), the next hot topic is automatic translation error correction (TEC) for human translations

Imperfections in machine translation (MT) have long motivated systems for improving translations post-hoc with automatic post-editing (APE). (1) In contrast, little attention has so far been devoted to the problem of automatically correcting human-generated translations. A study presenting, for the first time, a model for automatic translation error correction (TEC) of human translations was published by natural language processing (NLP) scientists at Lilt and the University of California at Berkeley (Jessy Lin, Geza Kovacs, Aditya Shastry, Joern Wuebker and John DeNero) in an award-winning paper titled “Automatic Correction of Human Translations”.

Humans are often assumed to produce trusted, high-quality translations, say the scientists, but in reality they do make errors, including spelling and grammar mistakes, that machines would be well suited to help with. At present, they add, TEC is mostly performed by humans: translators hired to review and edit translations. The study investigated the hypothesis that a TEC system could help reviewers produce better final translations. The findings indicate that it can: the TEC system assisted professional reviewers and led them to produce higher-quality reviewed translations.

Methodology and results

To investigate whether a TEC model could help reviewers produce higher-quality final translations, the researchers used a bilingual corpus called ACED, which contains three datasets from different domains. The data consist of 35,261 English-German translations performed and edited by professional translators (not post-edited machine translation). One finding that we at cApStAn found particularly interesting is “that human errors in TEC exhibit a more diverse range of errors and far fewer translation fluency errors than the MT errors in automatic post-editing (APE) datasets, suggesting the need for dedicated TEC models that are specialized to correct human errors”.

Real world applicability

To determine the real-world applicability of the TEC model, nine professional translation editors were asked to review sentences (255 of which had suggestions for corrections) and provide feedback. The reviewers accepted seventy-nine percent of the suggested corrections. Although the reliability of some reviewers was an issue, on the whole the study found that professional translators produced higher-quality translations when assisted by a TEC model.

Other advantages of TEC

Some reviewers commented that a TEC system could serve as a memory aid or as a substitute for researching client-specific requirements or preferences, which is often a time-intensive part of the translation production process. Others commented that TEC could be useful as an attention-directing tool, making them aware of the errors they should look out for, especially in repetitive content where it is easy to miss details.

The future of TEC

The more the TEC model’s precision increases, says Jaap van der Meer, author of an article for Slator, the greater its potential to make a practical difference during the review stages of translation production, and TEC could be the next step in translation workflow automation. The researchers are planning to extend their work to other language pairs. Stay tuned!

Footnotes

1) Automatic Post-editing (APE) is an area of research aiming at exploring methods for learning from human post-edited data and applying the results to produce better Machine Translation (MT) output.

Sources

“Research on Automatic Correction of Human Translation Wins Award”, Rocío Txabarriaga, Slator, July 4, 2022

“Automatic Correction of Human Translations”, Jessy Lin, Geza Kovacs, Aditya Shastry, Joern Wuebker, John DeNero, North American Chapter of the Association for Computational Linguistics (NAACL), June 17, 2022

Photo credit: Shutterstock