Jaap van der Meer and the Future of Translation

The Dutch have always been keen on networking, whether inland with their interconnected canals, or overseas in their swashbuckling trade voyages to the East. These are broad historical generalizations, of course, but serve as a metaphor for Jaap van der Meer, a visionary Dutch linguist who carries on that legacy of networking, envisioning a world coming together through the power of translation and data collection. 

Born in The Hague in 1954, van der Meer attended the University of Amsterdam, where he majored in literature and linguistics. In 1980, van der Meer started his first translation company, INK, which developed translation memory and terminology lookup software; 25 years later, he founded TAUS, a think tank and language data network that offers the largest industry-shared repository of data in language engineering. Nearly two decades after that, van der Meer is now revered as an innovator, pioneer, and visionary in the contemporary language industry. 

 

So what’s his deal, and what kind of future does he advocate? The TAUS website hosts van der Meer’s succinct manifesto for the future of translation, “Reconfiguring the Translation Ecosystem.” “A reconfiguration of the translation system is inevitable and is in fact already in full swing,” van der Meer begins, before laying out what he believes to be the future of the translation industry.

Traditional translation and localization is the work of a creative, he argues, and it is the most expensive and time-consuming approach; a general estimate of the cost of human translation is between 100,000 and 150,000 euros per 1 million words. With the advent of free MT platforms, the need for human creativity and intervention is on the decline; in its stead are markets for MT models and data, which offer near-human translations at a fraction of the price. “In the transition phase that we are in now, AI technology is being molded into existing processes, which in turn leads to a deglamorized and devalued role for the human translator,” he claims. What we should be focusing on instead is devising better models for AI-powered translation and feeding them clean data.

Van der Meer repeats his argument in a 2021 article for the language-industry magazine MultiLingual titled “Translation Economics of the 2020s,” where he delves deeper into current developments in machine translation—as well as the language industry in general—and makes a prediction about the industry’s near future. 

A timeline of technological advancements. Jaap van der Meer, “Translation Economics of the 2020s”

One of his more contentious predictions concerns the “mixed economy” of the translation industry, that is, the coexistence of human translators and free, near-zero-cost translation machines. “Once the right infrastructure is in place,” he writes, “the production of a new translation costs nearly nothing and capacity becomes infinite.” Van der Meer goes on to claim that the mixed economic model will no longer be sustainable in the future; machines will replace human translators, in line with the general current towards singularity.

His ideas are not completely ungrounded, to be fair. Van der Meer cites a number of instances in which technological advancements have put to rout entire industries and businesses: Kodak (beaten out by Sony’s digital cameras), Blockbuster (thanks to Netflix and streaming media), and the taxi industry (fighting against Uber and Lyft). The same goes for translation, he argues. “In 2019, Google alone translated 300 trillion words compared to an estimated 200 billion words translated by the professional translation industry,” he writes. “By 2025, enterprises will see 75% of the work of translators shift from creating translations to reviewing and editing machine translation output.”

The focus of the language industry, then, is no longer the quality and commerce of human translation, but rather the development and upkeep of translation machines. In this new paradigm, humans are no longer the most valuable resource. It’s data: data required to feed and train translation machines to perfection and human parity. It’s only sensible, van der Meer seems to argue, that the translation industry undergo this paradigm shift, alongside numerous other industries facing similar reconfigurations. In a world where machines slowly work towards matching human capacity, data is king. And with data come concerns over copyright: who has claim to the source text? The translatum?

An overview of the modern translation pipeline. Jaap van der Meer, “Translation Economics of the 2020s”

Van der Meer’s predictions of a reconfigured, completely MT-powered future have been criticized by professional researchers and translators. Alan Melby of the International Federation of Translators (FIT) and Christopher Kurz, Head of Translation Management at ENERCON, responded to van der Meer in a follow-up article aptly titled “Data: Of Course! MT: Useful or Risky. Translators: Here to Stay!” in which they argue for the necessity of human translators in upholding the rigorous standards of translation.

Their vision of the future is much more hopeful for translators: “We believe that the current mixed economic model is not only sustainable but beneficial to society,” write Melby and Kurz. “Consequently, we believe that there is definitely a future for professional human translators.” To prove their point, the authors rebut van der Meer’s arguments in detail.

The first flaw in van der Meer’s vision is his definition of data; Melby and Kurz point out that van der Meer is too vague about exactly what kind of data he is dealing with. “It is not clear which data type is the focus of [van der Meer’s] article,” they say. “It also confusingly labels metadata as ‘translation data.’ We reject this label for metadata.” For Melby and Kurz, there are numerous types of data with different usages in different contexts; in that sense, van der Meer’s article never makes clear which data it means or how that data would be handled.

The complicated nature of data leads to another refutation: that “translation cannot always be ‘zero cost.’” Given the numerous types of data (co-text, XLIFF, TMX, metadata, their subsets, etc.), human intervention is necessary to maintain, categorize, and clean up the data needed to fuel translation machines.

Another main argument posited by Melby and Kurz is that computers are simply not capable of “understanding” context in a document, and will not be for a long time, if ever. “A system can be trained on massive amounts of data and produce impressive results without understanding language,” they remark, but point out that “[these results] have not brought us closer to an understanding of how humans process language.” In other words, machine translation systems are still, at best, mere text processors, incapable of understanding. For this reason, machine translation could not possibly replace human translation in the short time span van der Meer claims.

The Hans-Christian Boos Pyramid: a model of machine-learning processes. Alan K. Melby and Christopher Kurz, “Data: Of Course! MT: Useful or Risky. Translators: Here to Stay!”

Because of their lack of intelligence and true understanding, computers cannot be trusted to take on the nuances and complexities of translation in cases where “errors in the translation can cause damage, injury, or harm.” Human translation is not a “creative” act as van der Meer claims; if anything, “creativity that ignores agreed-upon requirements is unwanted in the majority of today’s professional translation industry.” Melby and Kurz’s idea of the human translator is one of rigid rules and strict standards—“fulfilling the production phase’s requirements.” And here is the crux of Melby and Kurz’s article: “humans can check their own behavior in the translation process and verify their translations against specification.” They doubt that machine translation systems can do the same. 

When asked about the criticism and debate his original article stirred up, van der Meer retorts that “people are locked up in their here and now, and they don’t see what’s really happening with the world.” Van der Meer sees the world as changing at breakneck speed, given how much it has changed in the last decade or two. He places his faith in the evocative, seemingly limitless power of technology to innovate and reconfigure the language industry, and it sounds good, too, to think of a world in which words, sentences, and paragraphs are translated, nuanced and delicate, in the blink of an eye. “We humans have to outsmart the machines, which means we shouldn’t become slaves to them and do the stupid work of correcting their output.”

A pie chart of how much work machine translations can handle. Melby and Kurz, “Data: Of Course! MT: Useful or Risky. Translators: Here to Stay!”

But for Melby and Kurz, the “stupid work” is what translation entails. Translators have a duty to provide the best, most accurate translations for their clients and customers; if tedious post-editing is what it takes to do that, then that is what translators must do. In Melby and Kurz’s eyes, van der Meer is an idealist, obsessed with the nobility of human vocations. Van der Meer’s utopia is one where humans don’t have to lift a finger to get work done; for Melby and Kurz, such a utopia destroys all raison d’être and offers no solutions for a pre-singularity time.

Who is right? Only time will tell. Van der Meer is visionary and futuristic, but overly idealistic and vague. Melby and Kurz are experienced and professional, but nostalgic. Their differences come down to their belief in technology. And technology, as we all know, has impressed us when we least expected it and let us down in our moments of need.

 

References
https://multilingual.com/articles/jaap-van-der-meer/
https://blog.taus.net/reconfiguring-the-translation-ecosystem-in-the-2020s
https://multilingual.com/issues/july-august-2021/translation-economics-of-the-2020s/
https://multilingual.com/articles/data-of-course-mt-useful-or-risky-translators-here-to-stay/