
The Microsoft Style Guide Part 5: Punctuation

In Quack This Way, a book about language and writing by lexicographer Bryan A. Garner and writer David Foster Wallace, Wallace notes that “punctuation isn’t merely a matter of pacing or how you would read something out loud. These marks are, in fact, cues to the reader for how very quickly to organize the various phrases and clauses of the sentence so the sentence as a whole makes sense.” 

Small as a comma or a period may be, punctuation plays a critical role in writing. It’s not just a formality we pick up in grade school; punctuation can make or break a written text, because it delineates writing into segments we can process and understand. 

Punctuation doesn’t stop at that, however. Form and content are difficult to differentiate in good writing, and writers have taken advantage of the variety of punctuation to not only clarify their writing, but also to stylize it and give their stories life. Take this image by neurologist Adam J. Calhoun, for example:

Punctuation in Blood Meridian by Cormac McCarthy (left) and in Absalom, Absalom! by William Faulkner (right). Image credits: https://medium.com/@neuroecology/punctuation-in-novels-8f316d542ec4

Calhoun has stripped Blood Meridian and Absalom, Absalom! (phenomenal novels by Cormac McCarthy and William Faulkner, respectively) of their words, leaving behind only the punctuation marks. The punctuation alone says much about each novel. 

There’s McCarthy, with his fields of periods, interspersed with the occasional question mark. We can assume, just by looking at the punctuation, that the work is a largely static and quiet one. There’s not a quotation mark to be seen; modern writers do like to omit quotation marks, opting for italics instead. On the other hand, there’s Faulkner, with his numerous parentheses—some containing up to five punctuation marks—and his interminable sequences of commas. His sentences are long and winding, fearless of nested clauses and phrases.
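
Calhoun’s visualization is easy to approximate in a few lines of code: discard everything but the punctuation. A minimal Python sketch (the function name and sample sentence are our own, not from either novel):

```python
# Strip a text down to its punctuation marks, in the spirit of
# Calhoun's novel fingerprints.
MARKS = set(".,;:!?\"'()—–-")

def punctuation_only(text):
    return "".join(ch for ch in text if ch in MARKS)

sample = 'He said, "Wait — is this it?" She laughed; he didn\'t.'
print(punctuation_only(sample))  # ,"—?";'.
```

Run over the full text of a novel, the output gives the same at-a-glance fingerprint as Calhoun’s images.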

Calhoun also compares other well-known novels by the kind of quotation marks used and how often they appear:

A Farewell to Arms, for example, is chock-full of quotation marks, which suggests an abundance of dialogue and conversation. Alice in Wonderland, however, boasts the most frequent use of single quotation marks. Save for Blood Meridian and A Farewell to Arms, the comma is almost always the most popular mark. This is all to say that punctuation doesn’t just sit idly by; it brings a work to life. It’s an important part of writing, to say the least, and we should treat it as such. 

The Microsoft Style Guide has a number of guidelines on punctuation. Most of them are stylistic or grammatical, but conforming to them will help you write in that crisp, friendly Microsoft voice we’ve all come to love. By mastering these guidelines, you’ll not only write more concise, logical texts but also learn to harness punctuation to give your voice more nuance.



Apostrophes

In English, the apostrophe is used to form the possessive case of nouns. Contrary to what many of us were taught in grade school, the Style Guide advises adding an apostrophe and an s, “even if the noun ends in s, x, or z,” to avoid confusion. After all, a word ending in -s isn’t necessarily plural; adding both the apostrophe and the s makes the possessive unambiguous. Apostrophes are also used to form contractions, standing in for the missing letters, as in can’t, don’t, and it’s. 

There are some places apostrophes don’t belong—although you often see them there, especially on the internet. Don’t use an apostrophe to write the possessive form of it; the correct word is its. Don’t use an apostrophe with possessive pronouns—your’s and their’s are incorrect. Finally, don’t use an apostrophe to form plurals. 



Colons

Colons and semicolons are a tricky couple. Their uses can be difficult to tell apart at times, but the Microsoft Style Guide has some solid guidelines that might help you use them with more purpose and deliberation.

The first function of the colon is to introduce a list: put a colon at the end of the phrase that introduces it. This is straightforward enough, as in the sentence “I love three things: coffee, tea, and freedom.” 

The second function of a colon is to demarcate a statement from its expansion. Put a colon at the end of a statement when you want to follow it with a second statement that expands on the previous one. The Style Guide does remark that the colon in this function should be used sparingly. Here is Microsoft’s example: “Microsoft ActiveSync doesn’t recognize this device for one of two reasons: the device wasn’t connected properly or the device isn’t a smartphone.”

To this, the guide notes that “most of the time, two sentences are more readable.” The colon can often be replaced—to greater effect—by a period. But if you do choose to use a colon, make sure to lowercase the word that follows it. 

There are exceptions, of course, to the lowercase rule. If the word after the colon is a proper noun, such as a city name, it keeps its capital. And the third function of the colon is to introduce a direct quotation, as in: “The colon introduces a direct quotation.” In that case, too, the word after the colon is capitalized.

Lastly, the fourth and final function of the colon is to demarcate the title from its subtitle, such as “Block party: Communities use Minecraft to create public spaces.”



Commas

As seen at the beginning of this post, the comma is one of the most frequently used punctuation marks, followed closely by the period and the single quotation mark. Its uses are many: it is a flexible, multi-purpose mark that writers rely on as their primary means of demarcation and organization.

The Microsoft Style Guide lists the usual functions of the comma, which you probably know. The comma is used when listing three or more items, as in the sentence “Outlook includes Mail, Calendar, People, and Tasks.” Note how the guide advocates the Oxford, or serial, comma (the one right before the conjunction), as is to be expected of a U.S. corporation.

The comma follows an introductory phrase, that is, a subordinate or dependent clause that precedes an independent clause. Take, for example, this sentence: “With the Skype app, you can call any phone.” The comma also joins independent clauses with a conjunction: two or more independent clauses can be connected with a comma and a conjunction, as in the sentence “Select Options, and then select Enable fast saves.”

The comma is also used to replace the word and, like in the sentence “Adjust the innovative, built-in Kickstand and Type Cover.” The guide, however, does advise you to avoid this type of phrasing and opt instead for a friendlier, conversational tone.

Lastly, a formal use of the comma that some might not know: use commas to surround the year when writing a complete date within a sentence. Here is an example from Microsoft: “See the product reviews in the February 4, 2015, issue of the New York Times.” Note how the year 2015 is surrounded by commas.


Dashes and hyphens

Dashes and hyphens often go unnoticed in grade-school grammar lessons. In writing, the em dash (—), the en dash (–), and the hyphen (-), each with its distinct length, are used in different scenarios.

Em dashes are frequently used in writing for emphasis. Take, for example, this beautiful sentence from Mary Gaitskill’s story collection Bad Behavior: “Because once, when I was about twelve, I was in my father’s study rubbing his neck—I used to do that all the time for him—and there was this Playboy calendar over his desk and some babe was on it and I said to him, ‘Do you like her?’ and he said, ‘Sure I do,’ and I said, ‘Would you like to meet her?’ and he looked shocked and said, ‘No, she’s just a dumb broad.’” 

The em dash is also used at the end of sentences, like in this sentence from Rachel Cusk’s Kudos: “There was this antique telephone on the desk and I kept wanting to call someone up and get them to come and rescue me. One day, I finally picked it up and it wasn’t connected—it was just a decoration.”

En dashes are used more for formatting: to indicate a range of numbers, such as 2015–2017; a minus sign (12 – 3); or a negative number. En dashes also replace the hyphen in a compound modifier when “one element of the modifier is an open compound,” such as Windows 10–compatible or dialog box–type.

Lastly, hyphens. Hyphens connect two or more words that precede and modify a noun as a unit. There are caveats, however. If the compound modifier is clear and unambiguous without the hyphen, it’s okay to leave the hyphen out. Hyphens should be used when one of the words in the compound modifier is a past or present participle (“left-aligned text,” “free-flowing form”). 

We could go on for a long time about the uses of the hyphen, from compound numerals and fractions (“twenty-fifth,” “one-third”) to confusing prefixes (“non-native,” “non-XML”). When writing in the Microsoft voice, keep the guide close at hand and refer to it when necessary.


Exclamation marks and question marks

The rules for the exclamation mark (!) and the question mark (?) are simple: use them very sparingly, only when needed.


Quotation marks

The guide is very strict on the use of quotation marks—more so, perhaps, than other punctuation. After all, quotation mark usage varies across regions and countries, and as such, there is a greater need for standardization.

For example, the guide advises the following: “In most content, use double quotation marks (“ ”) not single quotation marks (‘ ’)… In printed content, use curly quotation marks (“ ”)… In online content, use straight quotation marks.”
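
The curly-versus-straight rule can even be enforced mechanically when adapting printed content for the web. A hypothetical Python sketch (the function name and mapping are ours, not Microsoft’s):

```python
# Map curly (typographic) quotation marks to their straight
# equivalents, as the guide recommends for online content.
CURLY_TO_STRAIGHT = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}

def straighten_quotes(text):
    return "".join(CURLY_TO_STRAIGHT.get(ch, ch) for ch in text)

print(straighten_quotes("\u201cIt\u2019s ready,\u201d she said."))  # "It's ready," she said.
```

The reverse direction (straight to curly, for print) is harder, since a straight quote’s opening or closing role depends on context.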

The guide also suggests you use the terms quotation marks, opening quotation marks, and closing quotation marks instead of quote marks, quotes, open or close quotation marks, or beginning or ending quotation marks.



Semicolons

The guide generally disapproves of semicolons, which are used mostly to mark sentence breaks and join independent clauses. Semicolons don’t help much in speech; after all, punctuation has much to do with the silences between words, and it’s unclear what kind of silence or break the semicolon is supposed to signal.

But if need be, semicolons can be used. They join independent clauses without a conjunction (“Select Options; then select Automatic backups.”) and connect contrasting statements, also without a conjunction (“What’s considered powerful changes over time; today’s advanced feature might be commonplace tomorrow.”). 


We hope you enjoyed this fifth and final part of our introduction to the Microsoft Style Guide. For writers, especially in the tech and business industries, the guide will polish and refine your writing, rendering your words more comprehensible to those you wish to communicate with. Keep practicing, and keep the guidelines close at hand.


The Microsoft Style Guide Part 4: Global Communications

If you’ve read our previous blog posts, you might know that Microsoft is dedicated to upholding certain social standards with its list of rules and suggestions outlined in the Microsoft Style Guide. The previous sections dealt with social issues such as sexism and racism, but in this section—Global Communications—we’re moving away from these universally entrenched issues and focusing more on specific taboos and values that vary by region and culture.

The Global Communications section is particularly important for localization experts, whose job is to make sure content written in one language is not translated into another language in a way that can be harmful or offensive to someone else’s culture. Plus, it helps to know more about other cultures; there are moments in one’s translation career in which one must write or localize content that deals with unfamiliar cultures.

If so, this is the place for you. The Microsoft Style Guide offers localization tips and rules on an array of topics, ranging from art and currency to HTML considerations. 



Art

Art, in the context of the style guide, includes the use of colors and images in content shared globally. 

Color is one of the primary ways of conveying ideas visually. Colors can mean different things to different cultures, however, which has long posed a challenge for anyone trying to send visual messages across the world. An article by Eriksen Translations explains how colors are carriers of ideology:

  • “in former Eastern European Bloc countries, red can still evoke associations with communism.”
  • “Blue is… often considered a safe color for a global audience, because it lacks significant negative connotations.”
  • “green brings up negative connotations in Indonesia, where it is regarded as a forbidden color, representing exorcism and infidelity… In South America, however, green is the color of death.”
  • “in the Middle East, [orange] is associated with mourning and loss.”
  • “In Egypt and in much of Latin America, [yellow] is linked to death and mourning.”
  • “in some Latin American countries, such as Colombia and Nicaragua, the color [brown] can be met with disapproval.”

Images can also come off as offensive—from depictions of specific social situations to artistic realizations of English-language idioms—and as such, the style guide recommends that writers “choose simple or generic images that are appropriate worldwide.”

Examples of appropriate images worldwide are “soccer players and equipment, generic landscapes and settings, pens and pencils, international highway signs, and historic artifacts.” On the other hand, the style guide recommends staying away from seasonal images, holidays, and major landmarks and famous buildings (“which may have legal protections or be associated with politics or religion”). Also not recommended are depictions of social situations involving men and women (“risky in a few locales”), hand signs, and art based on English idioms. 

These precautions pertain only to the artistic aspect of images as carriers of meaning. There are, of course, technological aspects to consider. The style guide advises limiting online graphics and animations, as long page-loading times can be expensive in some countries. It also recommends making text in graphics easy to edit; translators know how difficult (at times impossible) translating text embedded in graphics can be.

Furthermore, storing art in separate files and linking to it from within a document can help localizers, who can modify art much more easily if it isn’t embedded within the document. Finally, it’s imperative that localizers “check restrictions on imported content,” especially maps, which can be heavily regulated depending on the country. 



Currency

A general rule of thumb when writing currency names is to “lowercase the names of currencies, but capitalize the reference to the country or region.” Here are some examples provided by the style guide: US dollar, Canadian dollar, Hong Kong SAR dollar, Brazilian real, and South African rand. The style guide does allow capitalization of the currency name, but only in a structured list, such as “a table that compares available pricing options.”

More important is how to write monetary amounts; in written texts, specific amounts come up more often than the names of the currencies themselves. For this, the style guide recommends “us[ing] the currency code, followed by the amount, with no space.” Here is the example it provides: “The company generated BRL2.89 billion (USD1.42 billion) in net revenue in 2015.”

It is OK, however, to use only the symbol when it is clear what currency is being dealt with, such as the example sentence: “Adatum Corporation generated €1.42 billion in net revenue in 2015.”
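
The code-then-amount convention is mechanical enough to encode. A minimal Python sketch (the function name and signature are our own illustration, not from the guide):

```python
# Format a monetary amount Microsoft-style: the currency code
# immediately followed by the amount, with no space between them.
def format_currency(code, amount, scale=""):
    suffix = f" {scale}" if scale else ""
    return f"{code}{amount}{suffix}"

print(format_currency("BRL", "2.89", "billion"))  # BRL2.89 billion
print(format_currency("USD", "1.42", "billion"))  # USD1.42 billion
```

A fuller implementation might validate the code against the ISO 4217 currency-code list, but the formatting rule itself is just concatenation.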


Examples and scenarios

In technical writing, use-case scenarios—“detailed descriptions of specific customer interactions with a product, service, or technology”—are unavoidable. They make it easier for users and customers to understand the product by using fictional people in scenarios. Creating fictional space and people, however, can be risky; one culture’s notion of acceptable situations may not fly in other cultures. The style guide provides three specific guidelines for localizers looking to globalize examples and use-case scenarios.

  • Perception of use-case scenarios differs by culture. The style guide notes how “in some cultures men and women don’t touch in public, even to shake hands,” and how “greeting cards are uncommon in many parts of the world.” 
  • Don’t mention real places. If you need to, “vary the locales from one example to the next” so as not to show bias towards a specific region.
  • Keep in mind that certain technologies and standards aren’t used worldwide. The style guide notes that “standards vary, from phone, mobile, wireless, and video to measurement, paper size, character sets, and text direction.” While US standards are in practice common in certain parts of the world, standards differ more often than they coincide.


Names and contact information

If you’re from a country that doesn’t follow the US convention for writing names (given name first, family name second), you might have had some difficulty filling out forms that assume it. Different cultures write names differently, and forms that require names should be as lenient as possible to accommodate such differences.

For example, in Arabic culture—and other areas across Africa and Asia—one’s given name is connected with a “chain of names, starting with the name of the person’s father and then the father’s father and so on.” Other cultures—Russian, Scandinavian, and so on—use patronymics or matronymics, names derived from the given name of one’s father or mother. The Russian composer Pyotr Ilyich Tchaikovsky’s patronymic comes from his father’s given name, Ilya; the singer Björk Guðmundsdóttir’s surname comes from her father, whose name is Guðmundur. Some Eastern cultures reverse the order of the given and family names.

To accommodate these differences, the Microsoft Style Guide suggests the following guidelines to allow for more variance in name input.

  • Use First name and Last name in forms, or simply Full name.
  • If you include a Middle name field, make it optional.
  • Use Title, not Honorific, to describe words such as Mr. and Mrs. Not all cultures have equivalents to some titles used in the United States, such as Ms.

Names aren’t the only things addressed by the guide. Addresses are a constant source of headaches, especially when filling out forms from other countries.

  • Provide fields long enough for customers to include whatever information is appropriate for their locale.
  • Use State or province instead of State. Fields that might not be relevant everywhere, such as State or province, should be optional.
  • Use Country or region instead of just Country to accommodate disputed territories. It’s OK to use Country/Region if space is limited.
  • Include a field for Country or region code if you need information for mailing between European countries or regions. It’s OK to use Country/Region code if space is limited.
  • Use Postal code instead of ZIP Code. Allow for at least 10 characters and a combination of letters and numbers.

Phone numbers are, of course, also important to accommodate.

  • Provide enough space for long phone numbers.
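
Taken together, these form guidelines suggest a data model in which locale-sensitive fields are optional and generously sized. A hypothetical Python sketch (all field names, the 10-character postal-code limit, and the validation rule are our own reading of the guide, not its wording):

```python
from dataclasses import dataclass
from typing import Optional

# A locale-friendly contact form model: middle name and
# state/province are optional, "country or region" is used
# instead of "country", and postal codes may mix letters and digits.
@dataclass
class ContactForm:
    full_name: str                           # or separate first/last fields
    middle_name: Optional[str] = None        # optional, per the guide
    state_or_province: Optional[str] = None  # not relevant everywhere
    country_or_region: str = ""
    postal_code: str = ""
    phone: str = ""                          # leave room for long numbers

    POSTAL_MAX = 10  # allow for at least 10 characters

    def postal_code_ok(self):
        # Accept letters and numbers; internal spaces are tolerated.
        compact = self.postal_code.replace(" ", "")
        return 0 < len(self.postal_code) <= self.POSTAL_MAX and compact.isalnum()

form = ContactForm(full_name="Björk Guðmundsdóttir",
                   country_or_region="Iceland", postal_code="101")
print(form.postal_code_ok())  # True
```

The point is less the exact field set than the posture: make fields optional and roomy unless you are certain every locale needs them.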


Time and place

The official format for dates, as designated by the Microsoft Style Guide, is month dd, yyyy, in which the month is written out as a word, not a numeral. Because other countries use different date formats—dd/mm/yyyy, yyyy/mm/dd, and so on—writing the month out helps avoid confusion. 
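
Spelling the month out is straightforward in code. A minimal sketch (the function name is ours; the month names are hard-coded in English so the output doesn’t depend on the system locale):

```python
# Produce "month dd, yyyy" dates, spelling out the month to avoid
# the dd/mm vs. mm/dd ambiguity across regions.
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

def microsoft_date(year, month, day):
    return f"{MONTHS[month - 1]} {day}, {year}"

print(microsoft_date(2015, 2, 4))  # February 4, 2015
```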

Also note that you should avoid referring to seasons; talk about months or calendar quarters instead. This prevents confusion for readers in the southern hemisphere, where the seasons are reversed. 


Additional writing tips

So far, we’ve covered specific formats and visual choices, but in the end, localization practices benefit most from logical, consistent, and clear writing choices. Good writing helps localizers translate content better; in cases where it is inevitable for English content to be used without translation, good, clear writing can help facilitate understanding. With this in mind, the Microsoft Style Guide has a number of writing tips to improve the quality of your writing.

  • Write short, simple sentences. If your sentences have too many commas and other punctuation, it most likely means your sentence is too complicated. Short, simple sentences are much easier to translate and understand.
  • Use lists and tables instead of complex sentences and paragraphs.
  • Use that and who to clarify the sentence structure. These relative pronouns explicitly show which part of the sentence is referring to which; without them, non-native English speakers might have a harder time understanding.
  • Include articles, such as the. Some writing styles common in technical writing omit articles, but the practice is discouraged: it confuses the reader for the sake of more compact writing.
  • Do not use idioms, colloquial expressions, and culture-specific references. These expressions and references are often dependent on the history and culture of a specific country—i.e. the U.S.—and can confuse readers who are not from this specific part of the world.
  • Stay away from modifier stacks, which are long chains of modifying words, confusing even to native English speakers and difficult for localizers to translate properly.
  • Place adjectives and adverbs close to the words they modify.

Writing with simplicity and concision is good practice, not just for legibility and localization, but also for machine translation, which is now used more often than ever, thanks to advances in technology. For a machine, the simpler a sentence, the more accurate the translation. If you know your content will be machine-translated, try abiding by these guidelines. 

  • Use conventional English grammar and punctuation.
  • Use simple sentence structures.
  • Use one word for a concept, and use it consistently. Machines have a hard time when a single concept is expressed by several words, or a single word carries several meanings. 
  • Limit your use of sentence fragments.
  • Use words ending in -ing carefully. Words that end in -ing often have multiple functions—as a verb, an adjective, or a noun—and can confuse the machine.
  • Use words ending in -ed carefully, for the reasons outlined above.
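
Some of these machine-translation checks lend themselves to crude automation. A rough Python sketch (the word-count threshold and function name are our own, and the heuristics are deliberately simplistic):

```python
import re

# Heuristically flag sentences that may machine-translate poorly:
# too many words, or -ing words (often ambiguous between verb,
# adjective, and noun readings).
def mt_warnings(sentence, max_words=20):
    words = re.findall(r"[A-Za-z'-]+", sentence)
    warnings = []
    if len(words) > max_words:
        warnings.append("sentence is long")
    ing_words = [w for w in words if w.lower().endswith("ing")]
    if ing_words:
        warnings.append("contains -ing words: " + ", ".join(ing_words))
    return warnings

print(mt_warnings("Saving the file, she kept writing."))
# ['contains -ing words: Saving, writing']
```

A real linter would also check for sentence fragments, stacked modifiers, and missing articles, but even a checker this crude catches the most common offenders.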


These are just some of the numerous localization and translation practices that make for a more inclusive, friendly online atmosphere for users all over the world. We hope that you enjoyed today’s content, and make sure to check out the next and last part of our series on the Microsoft Style Guide, in which we’ll be dealing with specific grammar and punctuation rules.



The Microsoft Style Guide Part 3: Bias-free Communication

In our last blog post, we spoke a bit about special terms and guidelines. Moving away from formal writing techniques, we now delve into the tricky realm of communication, which is what this style guide aims to facilitate and advocate.

After all, Microsoft is one of the most recognizable brands in the world. Almost everyone has used, or uses, a Microsoft product. With its domination of the technology market comes a greater responsibility to cater to the cultural needs of its customers, which brings us to the question: how does Microsoft advertise and provide service in a way that is inclusive of such a varied audience?

A bias-free mode of communication—in which no one is left out, on purpose or by accident—is high on Microsoft’s list. There is a section in the Microsoft Style Guide that covers this topic extensively. “Microsoft technology reaches every part of the globe, so it’s critical that all our communications are inclusive and diverse,” says the guide. Microsoft manages to stay relevant and appeal to everyone precisely because it actively strives for bias-free communication. 

So how does Microsoft do it? Language is fraught with biases that often go unchecked when we write, so no list of countermeasures can be exhaustive; the style guide offers general methods for the most pressing issues: sexism, racism, and ableism. Without further ado, here are some of the ways Microsoft combats bias in its writing. 


Gender-neutral alternatives for common terms

The English language, like most other languages, has a history of using gendered occupations and descriptions. You might be familiar with the English language’s tendency to use the word “man” as a generic term for all humankind, as in this quote from Abraham Lincoln’s Speech to the One Hundred Sixty-fourth Ohio Regiment: “We have, as all will agree, a free Government, where every man has a right to be equal with every other man.” 

If there are ways—and there are many—to avoid gender-specific language in favor of more neutral choices, it is only fair that we do so. The Microsoft Style Guide offers the following examples:

chairman → chair, moderator

man, mankind → humanity, people, humankind

mans → operates, staffs

salesman → sales representative

manmade → synthetic, manufactured

manpower → workforce, staff, personnel


Avoid using gendered pronouns in generic situations

For the same reason mentioned above, masculine pronouns are often used as generic placeholders in simulations and examples. The Microsoft Style Guide recommends avoiding gendered pronouns in such cases, opting instead for the second-person you; the plural we or they; specific roles (e.g., reader, employee); or the gender-neutral person or individual.

The guide also frowns upon constructions like he/she or s/he; such constructions are binary and hence reductive. Modern English is malleable enough to accommodate the third-person plural they as a singular, indeterminate pronoun. Here are some examples provided by the style guide:

If the user has the appropriate rights, he can set other users’ passwords. 

→ If you have the appropriate rights, you can set other users’ passwords. 

If you want to call someone who isn’t in your Contacts list, you can dial his or her phone number using the dial pad. 

→ If you want to call someone who isn’t in your Contacts list, you can dial their phone number using the dial pad.


Exceptions to gender-neutral pronoun usage

The style guide leaves some room for certain specific situations that do require gender-specific pronouns. There are three occasions where this is acceptable: writing about real people, quoting directly from someone’s words, or discussing gender-specific issues.

[An acceptable example of writing about real people]

The skills that Claire developed in the Marines helped her move into a thriving technology career.

[An acceptable example of citing a quotation]

The chief operating officer of Munson’s Pickles and Preserves Farm says, “My great uncle Isaac, who employed his brothers, sisters, mom, and dad, knew that they—and his customers—were depending on him.”

[An acceptable example of writing about gender-specific issues]

Do you have a daughter? Here are a few things you can do to inspire and support her interest in STEM subjects.


Representing diverse perspectives and circumstances

Oftentimes, writers default to a man of a certain racial or socioeconomic background when writing generic examples. It’s crucial that writers actively fight this urge to fall back to the norm when so many viable options exist. After all, the tech world—and the entire world at large—is so diverse. “Be inclusive of gender identity, race, culture, ability, age, sexual orientation, and socioeconomic class,” the style guide advises, “avoid using examples that reflect primarily a Western or affluent lifestyle.” 

This principle pertains not only to depictions of work life but also to personal and family settings. Diversity exists in all shapes and forms, and Microsoft advocates diversity in all its manifestations in the world.


Avoid generalizations when writing about the world

The world is big, but the space we as individuals occupy is quite small. In our corner of the world, it’s easy to think of the rest of the world in generalizations; it helps us make better sense of the world. 

But the world is so much more complex than that. Each person and each culture is full of nuances, complexities, and paradoxes that lie at the heart of their individuality and uniqueness. When we generalize a person, culture, or country, we reduce that individuality to a bite-sized piece, rendering them understandable to us. In the process, we erase the subjectivity of another person; of numerous people, in fact. 

This doesn’t only apply to negative generalizations. “Don’t make generalizations about people, countries, regions, and cultures,” says the guide, “not even positive or neutral generalizations.” Even positive generalizations are demeaning and reductive. So how do we write without generalizing? 

It’s simpler than it sounds. When we write about someone or someplace far from us, we should run the mental exercise of critiquing our own thoughts. We should ask ourselves, “Is this a fair, nuanced representation of the culture I am writing about?” or “Would I like it if someone else wrote about me like this?” 

That last question is particularly important: we should always practice substituting ourselves in the stead of others. It encourages us to think from the perspective of others so that we learn to treat others as we treat ourselves. 


Don’t use terms that carry unconscious racial bias or terms associated with military actions, politics, or historical events and eras

Take the master/slave analogy, for example. Many technical writers and engineers use it when referring to a hierarchy of machines and connections. The analogy flies under the radar because the terms describe machines and networks, not people.

But etymology is crucial in writing; it’s good to ask ourselves where a particular phrase comes from, what kind of connotations it carries, and how different people might understand it differently. Master/slave is an important example, not only because it is widespread in technical writing, but also because it stems from a particularly painful, longstanding history of slavery. Some people might find this unproblematic; others might find it offensive and threatening. 

The same applies to terms that refer to military actions and politics. Political terms are, by nature, divisive and controversial. Take the Liancourt Rocks, located between South Korea and Japan: depending on which name you use for the islets, you can offend a whole population, and even the “neutral” name Liancourt Rocks can itself be considered imperialist. 

Make sure to understand the political and societal nuances of a region or concept before writing about it. The world is a lived place, populated by people with very real experiences, and language both reflects and is fraught with those experiences. 


If you’re curious about other usage prescriptions or general principles in tech writing, the Microsoft Style Guide has all the information you need to get you on track. Come back for the next part of our Microsoft Style Guide special, where we cover all the interesting and useful information the guide has to offer. If you’re curious about how Sprok DTS uses the Microsoft Style Guide in its translation and localization, visit our website today and take a look at the wide variety of language services we provide.

Our translators and localization experts here at Sprok DTS are knowledgeable in various styles of writing, the Microsoft Style Guide included. Ask for a free quote for your next translation or localization project on our website.






The Microsoft Style Guide Part 2: Special Terms and Guidelines

In the previous blog post, we introduced the Microsoft Style Guide: Microsoft’s tech writing manual that emphasizes a clear, friendly voice and provides unified terminology and writing guidelines. Today, we’re exploring the Microsoft Style Guide’s special term collections, which are lists of set words and phrases used to avoid confusion, misunderstanding, and possible discrimination.

Microsoft’s term collections cover only a small portion of the extensive terminology it proffers, but these collections are set aside due to their importance and frequency of use. From terms about accessibility to those about date and time, these collections are useful to have in mind as you try to jot down or edit your writing. 


Accessibility terms

Microsoft has a long history of making its machines accessible for people with disabilities; Paul Schroeder of the American Foundation for the Blind even goes on to say that Microsoft has made “the strongest, most visible commitment to accessibility of any technology company.” Its efforts aren’t completely perfect—some versions of Internet Explorer have been less accommodating than others—but Microsoft has invested much effort into providing accessibility options for those that need it. 

It’s only natural, then, that the style guide offers specific terminology for phrases and words regarding accessibility. “Write in a way that puts people first,” the guide begins. “Don’t use language that defines people by their disability.” It’s the right thing to do, of course, to make sure no one is left behind in the wake of technology, and the style guide aims to reword phrases that people too often use without stopping to consider how people with disabilities might want to be described. 

Here are some of the most important terms outlined in the guide:

  • Sight-impaired, vision-impaired → blind, has low vision
  • Hearing-impaired → deaf, hard-of-hearing
  • Crippled, lame → has limited mobility, has a mobility or physical disability
  • Dumb, mute → is unable to speak, uses synthetic speech
  • Affected by, stricken with, suffers from, a victim of, an epileptic → has multiple sclerosis, cerebral palsy, a seizure disorder, or muscular dystrophy
  • Normal, able-bodied, healthy → without disabilities
  • Maimed, missing a limb → person with a prosthetic limb, person without a limb
  • The disabled, disabled people, people with handicaps, the handicapped → people with disabilities
  • Slow learner, mentally handicapped, differently abled → cognitive disabilities, developmental disabilities
  • TT/TDD → TTY (to refer to the telecommunication device)

To opt to use these words in place of others: that is the power of prescriptive terminology. It forces us to reckon with the ways in which we have been less than inclusive and helps us make better choices in our writing—choices that will help render our writing more inclusive for people of all backgrounds. 

Microsoft does not, however, prohibit the use of certain ability-specific verbs, such as see, read, or look, when calling out an example. 


AI and bot terms

Strangely enough, Microsoft recommends that writers “avoid talking about AI and bot technology”: strange advice, coming from a company that champions AI and bot technology. The issue is mainly with the perception of AI—it’s vague and scientific and comes off as foreign and unfamiliar to many people. Furthermore, given the short history of the field of AI, writers are liable to coin new terms for unfamiliar technological concepts. 

Here are some examples of AI- and bot-related concepts whose usage is clarified and defined by the Microsoft Style Guide:

  • AI: the guide discourages the spelled-out form, artificial intelligence, although the words intelligent and intelligence can be used to talk about the benefits of AI.
  • bot, chatbot, virtual agent: bot refers to “an app that performs automated tasks or engages with humans through a conversational interface.”
  • intelligent technology: Microsoft advises using the term intelligent technology only in UI contexts, to describe the underlying technology powering AI features. The phrase smart technology should not be used.


Cloud-computing terms

The cloud—as in cloud computing—is one of those black-box concepts: people understand its use in everyday life but don’t quite know how it works. The field of cloud computing is always evolving, and keeping up with standardized terminology is important to avoid confusion. As a company that develops a great deal of cloud-computing software and services, Microsoft is very particular about which phrases and words make the cut and which don’t.

  • cloud, the cloud: cloud shouldn’t be capitalized except when it’s part of a product name. Furthermore, cloud should be used mostly as an adjective, as in cloud computing or cloud services.
  • infrastructure as a service (IaaS): Microsoft recommends that writers “use [this term] for technical audiences only.” On first mention, it should be spelled out and followed by the abbreviation in parentheses; afterward, it can be used in its abbreviated form, which is IaaS, not IAAS. No hyphens.

A lot of the cloud-computing terms tend to be technical, and writers who aren’t familiar with these concepts might take creative liberties with terminology, perhaps using community cloud as a synonym of hybrid cloud or private cloud, when the style guide explicitly states: “never use.”


Computer and device terms

Unlike cloud-computing terms—which are difficult and therefore confusing—computer and device terms are confusing for the opposite reason: verbal and colloquial variance due to frequent use. For example, you can say power on as a synonym for switch on and turn on, although the style guide only approves of turn on as the proper verb phrase for starting a device. The style guide also differentiates between set up and install: set up is “preparing hardware or software for first use,” whereas install refers to “adding… hardware drivers and apps.”

This computer and device term collection is intriguing, to say the least; it puts an end to much confusion surrounding the spelling and usage of common household electronic devices. For example, according to the Microsoft Style Guide, adapter is the proper term, not adaptor. CDs and DVDs are discs, although Azure cloud storage and virtual machines employ disks in their systems. Display is the “general term for any visual output device,” and screen is “the usable portion of the display from its edges,” whereas monitor refers specifically to “a standalone desktop or mounted display device.” 

At some point, the guide relents, particularly for words such as drive, which carries so many meanings that it can confuse in certain contexts. 

Use drive as the general term for any type of device where a customer can save or retrieve files, including hard drive, CD drive, DVD drive, USB flash drive, or any other removable storage device. Use hard drive when necessary to refer to a drive on a PC where programs are typically stored. Avoid referring to the type of drive if you can.


Date and time terms

Microsoft, being an American company, follows the month-day-year format that so many people abhor, but what can we say? 

Aside from that controversy, the Microsoft Style Guide’s date and time term collection has numerous insightful formatting recommendations. For example, “midnight is the beginning of the new day, not the end of the old one,” a tidbit of information many don’t know. The guide also specifies that the ratio symbol (U+2236) should be used as the delimiter between hours, minutes, and seconds, rather than the standard colon. The difference? “A standard colon is baseline aligned; the ratio symbol, on the other hand, is vertically centered between the baseline and cap height.” 
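The delimiter advice is easy to get wrong in code, because the ratio symbol looks almost identical to a colon on screen. Here is a minimal Python sketch (the helper name format_time is our own, for illustration) that builds a time string with U+2236:

```python
# Format a time using the ratio symbol (U+2236) as the delimiter between
# hours and minutes, per the guide, instead of a standard colon.
RATIO = "\u2236"  # vertically centered between baseline and cap height

def format_time(hours: int, minutes: int) -> str:
    """Return an h:mm-style string delimited by U+2236 rather than ':'."""
    return f"{hours}{RATIO}{minutes:02d}"

print(format_time(2, 30))            # looks like "2:30" on screen
print(format_time(2, 30) == "2:30")  # False: different code points
```

The two strings render nearly identically but compare unequal, which is why a search-and-replace pass over existing documentation is the safest way to adopt the convention.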

The difference between a ratio symbol and a colon.


Keys and keyboard shortcuts

Have you ever tried explaining to a computer-illiterate person how to navigate a website or recover their password? If you have—and even if you haven’t—you’ll know the difficulty of explaining, verbally or in writing, which keys to press, in which combination, in what order.

Microsoft’s keyboard action term collection is its largest to date, perhaps for this very reason. There are numerous ways to say the same thing, and this creates much confusion on the part of the reader or listener. Do access key and keyboard shortcut mean the same thing? What about key combination? Is Alt, as in the Alt key, capitalized? Do we spell out the @ sign? Do we refer to # as the pound key, or maybe a hashtag? 

Take the term select, for example. People often use the following verbs as an alternative to the term select: press, depress, hit, strike, and use. As a solution, the Microsoft Style Guide prescribes the following guidelines:

Use select to describe pressing a key on a physical or on-screen keyboard. Don’t use press, depress, hit, or strike. 

Don’t use depressed to describe an indented toolbar button unless you have no other choice.

Use use when select might be confusing, such as when referring to the arrow keys or function keys and select might make customers think that they need to select all the arrow keys. 

Use use when multiple platform or peripheral choices initiate the same action or actions within a program.

Use select and hold only if a delay is built into the software or hardware interaction. Don’t use select and hold when referring to a mouse button unless you’re teaching beginning skills.


If you’re curious about other usage prescriptions or general principles in tech writing, the Microsoft Style Guide has all the information you need to get you on track. Come back for the next part in our Microsoft Style Guide special, where we cover all the interesting and useful information the guide has to offer. If you’re curious about how Sprok DTS uses the Microsoft Style Guide in its translation and localization, visit our website today and take a look at the wide variety of language services we provide.

Our translators and localization experts here at Sprok DTS are knowledgeable in various styles of writing, the Microsoft Style Guide included. Ask for a free quote for your next translation or localization project on our website.


The Microsoft Style Guide Part 1: a Brief Introduction

There are a plethora of style guides in the world. There are MLA, Turabian, Chicago, and APA if you’re working in academia. There are also specialized ones, such as business style for professional business communication, AMA for the medical field, and the AP and NYT manuals for journalists. 

The Microsoft Style Guide sits alongside the Apple Style Guide and the Google Developer Documentation Style Guide as the more frequently used style guides in the tech sector. Like all style guides, Microsoft’s exists for one reason: to formalize terminology, eliminate ambiguity, and ease communication. Language, in its nature, is divergent and oftentimes contradictory, leading to miscommunication—which is detrimental to industries that deal with specifics and precision. Style guides solve this problem by setting rules of practice and suggesting a singular mode of writing that seeks to clarify. 

This isn’t to say that these guides are fixed, inviolable rules that, when disobeyed, render your writing incorrect. Style guides serve as suggestions for better methodologies in writing but don’t do much more than suggest or recommend. “Break the rules,” says the Google Developer Documentation Style Guide, “depart from [them] when doing so improves your content.” The end goal of style guides is not absolute adherence to them, but rather, better comprehension for the readers of your writing.

In our five-part series on the Microsoft Style Guide, we will cover the basic premises of the Microsoft Style Guide. For people working in technology, learning more about the guide will acquaint you with the formal writing practices of the tech industry. Given the didactic, comprehensive nature of the guide, readers might find themselves better writers by the time they read through the entire manual. 


Why the Microsoft Style Guide?

In schools, the most frequently used style guides are MLA, APA, and Chicago; what these guides have in common is that they are dense and complicated. It figures, since academic writing often covers a broad range of sources and subjects. 

But Microsoft’s guide is catered more to a general audience; the style is meant to formalize Microsoft advertising practices and facilitate troubleshooting processes for people using Microsoft products. Hence, Microsoft’s “brand voice” sounds less like a droning professor and more like Alexa or Siri: simple, direct, and most importantly, approachable. 

The Microsoft Style Guide is similar to Apple’s and Google’s: they all advocate for a clear, warm, precise tone and inclusive writing practices for an optimal style. What sets Microsoft apart is its position as the de facto reference for writing technical style. If you’re looking for a foundational reference for technical writing, the Microsoft Style Guide’s extensive, comprehensive index covers a broad swath of tech-related concepts, from text formatting and developer content to accessibility guidelines and word choice recommendations.


Microsoft’s Brand Voice

The Microsoft guide defines voice as “the interplay of personality, substance, tone, and style”: a holistic, amorphous definition that does well to capture the multifaceted yet singular mode that is the voice. Microsoft’s voice has three general principles to make sure it is well understood by its readers: 

  • warm and relaxed: the Microsoft voice is natural and akin to everyday conversations.
  • crisp and clear: the Microsoft voice gets straight to the point, leaving out unnecessary details.
  • ready to lend a hand: the Microsoft voice is always geared toward helping customers.

Following these guidelines involves a bit more planning in the writing process. Texts written in the Microsoft voice must be clear and legible; crucial pieces of information must be placed strategically to aid comprehension, and excess information must be pruned. Jargon and acronyms should be replaced or eliminated, and longer sentences should be broken down.

In that sense, the Microsoft Style Guide’s voice is less a voice than a methodology of clear writing. In other words, the style guide produces a clear, crisp voice that everyone understands. 


Top 10 Tips for the Microsoft Style

Here is a list of the top 10 tips for writing and thinking in the Microsoft style and voice, as provided by the guide:

  1. Use bigger ideas, fewer words
    • Shorter is always better for the Microsoft voice, bordering on minimalist writing. 
  2. Write like you speak
    • Microsoft urges you to read your text out loud and examine whether or not it sounds like something a real person would say. 
    • There is a tendency for technical writers to adopt a robotic tone and diction, and it’s important to avoid this and pursue a more human voice.
  3. Project friendliness
    • Unlike other academic or professional writing styles, Microsoft not only allows but endorses the use of contractions, which make your writing sound friendlier and more conversational.
  4. Get to the point fast
    • “Front-load keywords for scanning,” says the guide, in line with the popular writing principle of the inverted pyramid. Important things first, then details.
  5. Be brief
    • Microsoft disapproves of lengthy, dawdling texts in favor of shorter, precise texts. “Prune every excess word,” the guide goes on to say.
  6. When in doubt, don’t capitalize
    • Capitalization is an important stylistic issue that plagues writers. For the sake of legibility and comprehension, the Microsoft Style Guide suggests defaulting to sentence-style capitalization: only capitalizing the first word of a sentence, even headings. “Never Use Title Capitalization (Like This). Never Ever,” says the guide.
  7. Skip periods (and : ! ?)
    • For titles, headings, subheads, UI titles, and items in a list that are three words or fewer, periods are better left off. 
  8. Remember the last comma
    • Microsoft is an avid fan of the Oxford comma—the comma that comes before the conjunction. The Oxford comma is the preferred choice of American writing styles.
  9. Don’t be spacey
    • Spaces can elongate the visual space of sentences, so they should be used only when necessary. Only one space after periods, question marks, and colons. No spaces around dashes.
  10. Revise weak writing
    • This is a general rule of thumb for any kind of writing. Weak writing is ineffective in any setting. This can range from eliminating frequently used yet unnecessary phrases—you can, there is, there are, etc.—to specific word choices and verb tenses/modes. 
    • Weak writing can also pertain to the structure of your writing. Try planning out your writing beforehand, using diagrams or flow charts. After you finish writing, don’t be afraid to move around your paragraphs, cut down on unnecessary parts, and rephrase sentences. Most of all, don’t be afraid to rewrite what you’ve written. It can only improve your writing.
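Tip 6 above can even be sketched in code. The helper below is our own toy illustration, not a Microsoft tool: it demotes a Title-Capitalized heading to the sentence-style capitalization the guide prefers, sparing only fully uppercase acronyms.

```python
# Toy sketch of sentence-style capitalization (tip 6). A real style checker
# would also preserve proper nouns; this version only spares acronyms.
def sentence_case(heading: str) -> str:
    out = []
    for i, word in enumerate(heading.split()):
        if word.isupper():      # leave acronyms like "UI" or "AI" alone
            out.append(word)
        elif i == 0:            # capitalize only the first word
            out.append(word.capitalize())
        else:
            out.append(word.lower())
    return " ".join(out)

print(sentence_case("Never Use Title Capitalization"))
# Never use title capitalization
```

Proper nouns are the hard part a real tool would need a dictionary for, which is exactly why the guide's rule of thumb is "when in doubt, don't capitalize."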

These 10 tips provide a general overview of the Microsoft writing style. It’s markedly different from other writing styles; for example, title capitalization remains the default mode of titling in many modes of writing, and contractions are generally avoided in formal writing. Except for a few of these peculiarities, the Microsoft Style Guide is a solid reference for good writing, applicable to a wide array of fields and not just the technology sector. 



In a sense, the Microsoft Style Guide—with its sections on accessibility guidelines, jargon elimination, and simple word choices—is a democratic, inclusive mode of writing that seeks to nurture communication. This is different from, say, academic writing, which has the opposite effect: it gatekeeps certain populations from accessing information with its dense, obfuscated sentences and words. In the same way technological advances break down language barriers, the Microsoft voice signals the era of information democratization. If this has never occurred to you, that’s completely okay. It could be that we are already used to the Microsoft voice; its PCs are an extension of our lives at this point.

Come back for the next part of our Microsoft Style Guide special; we will be covering more specific guidelines and terminology regarding accessibility, technology, and formatting. We hope you found today’s blog post to be informative. If you’re curious about how Sprok DTS uses the Microsoft Style Guide in its translation and localization, visit our website today and take a look at the wide variety of language services we provide.

Our translators and localization experts here at Sprok DTS are knowledgeable in various styles of writing, the Microsoft Style Guide included. Ask for a free quote for your next translation or localization project on our website.


Pioneering Low-Resource Language Translation with NeuralSpace

A couple weeks ago, we posted a news update about the London-based NeuralSpace raising 1.7 million USD in a seed round. A SaaS (Software as a Service) platform offering groundbreaking applications for 87 languages, many of which are low-resource languages, NeuralSpace allows its customers to train, deploy, and use their AI models for language processing applications without having prior knowledge of machine learning. This is useful for companies who need straightforward NLP solutions for their websites or products—especially enterprises working in low-resource languages. 

NeuralSpace’s CEO Felix Laumann recently sat down with SlatorPod’s hosts—Esther Bond and Florian Faes—to speak about the history of his company and share some insights about the services it offers. In a landscape that’s unforgiving for low-resource languages, it’s important to hear the stories of pioneers who shy away from mainstream high-resource languages (English, FIGS) and cater directly to a market for low-resource languages.


The Birth of NeuralSpace

With a strong background in computer engineering, mechanical engineering, and statistics, Laumann has always been fascinated by the mathematical models that power language technology. Laumann also mentions his experiences and interests in foreign languages, noting the beautiful differences between them and the important details that often go unseen in low-resource languages. Specifically, he mentions the way Indonesian has five different words for the word “ocean,” and how beautifully, markedly different the Tamil and Tibetan scripts are from the English alphabet. 

Given all this, Laumann partnered up with some colleagues and founded NeuralSpace three years ago. Driven by and sharing a passion for low-resource languages, NeuralSpace’s initial members focused their attention on providing NLP solutions specifically for low-resource languages, despite the small amount of data available. 

Laumann notes that the data collection process is abysmal for low-resource languages. A low-resource language, by definition, is a language without a substantial dataset available for analysis and training, comprising less than 1% of the internet’s written material. Data is mostly taken from the internet—the most freely available source of written material—but NeuralSpace also works with data acquisition companies that help collect spoken and written language data for low-resource languages. In some cases, NeuralSpace works directly with linguists to produce its own data, asking them to translate sentences line by line. What’s important is that the dataset covers everyday words and phrases—or in Laumann’s words, kitchen conversations. 

It’s a noble cause the company is working for, which raises the question of how it came into such a great amount of funding. According to Laumann, raising capital was a relatively straightforward process; NeuralSpace was part of an accelerator program that streamlined fundraising. Laumann recalls pitching his company’s ideas and mission at various online and offline events around London, where he is based. 


NeuralSpace, Now & Later

Three years after its founding, NeuralSpace now is a popular SaaS option for chatbot or conversational AI development companies, although the actual use cases vary widely. Given NeuralSpace’s unique position in the market as a provider of low-resource language solutions, customers look to NeuralSpace as an efficient, effective alternative or complement to frontend translation engines such as Google Translate. 

Where does NeuralSpace see itself in the future? Laumann talks in depth about his aspirations for the company; his current interests lie primarily in voice-to-voice live translation for low-resource language pairs, which is something Google Translate has yet to do. But it’s an arduous task, says Laumann: voice-to-voice live translation requires near-perfect speech recognition, text-to-text translation, and speech synthesis, all working in tandem. Voice-to-voice translation is a highly sought-after function in the modern global language market: any company that cracks it is bound to rise to the top. Building on this, Laumann hopes that NeuralSpace can cover any fundamental NLP project or problem—a one-stop solution to any NLP need a company might have. To do this, he hopes to make NeuralSpace’s NLP functionalities more customizable, offering a wide array of solutions that can be fitted together to provide optimal NLP interfaces for companies with varying needs.

NeuralSpace’s mission boils down to a simple cause: to “democratize NLP and make sure any developer can create software with advanced language processing in any language and not just English.” This mission highlights two important, persistent problems plaguing NLP: first, the obscurity of concepts like BERT, lemmatization, and tokenization, and the attendant lack of machine learning knowledge, which inhibits training, deploying, and scaling NLP models; and second, the lack of NLP solutions for the low-resource languages spoken in major parts of the world. 


How Does NeuralSpace Do It?

None of this answers a question that, at this point, lies at the heart of NeuralSpace. How on earth does NeuralSpace deal with the challenges that low-resource languages face? How does the company cope with the lack of annotated or unlabeled datasets, or the myriad of dialects present in some wider language families? There are clear, surefire processes that NeuralSpace employs to deal with these problems systematically and fundamentally. 

First is transfer learning, or leveraging prior knowledge to solve new tasks—the way humans do. NeuralSpace’s applications are based on language models that are highly adaptable, even in low-resource settings. These language models do not require annotated data and acquire language abilities via unsupervised learning (as opposed to supervised learning on parallel bilingual corpora, the predominant method for high-resource languages). Despite the possibilities that unsupervised, adaptable models offer, they aren’t exactly useful for specific tasks like “classifying user intents off-the-shelf,” so NeuralSpace fine-tunes these models to the point where they can solve user-specific tasks with limited amounts of data. In the process, the models learn to solve tasks accurately despite data scarcity—all through transfer learning. 

Laumann gives an example of this in a blog post about low-resource language models:

…let’s take the case of an e-commerce chatbot. The chatbot is supposed to answer queries and resolve customer issues around delivery time, refunds and product specifications. To form a conversation, the chatbot must first understand the intent of the customer, then a few entities. For example, “Where is my Razer Blade 14 that I ordered on the 4th of December?”, whose intent is classified as “check order status” with the entities “laptop”: “Razer Blade 14” and “date”: ”4th of December”. Thus, we will need a simple intent classification model.

The problem, however, is that there are hundreds of intents and corresponding actions that follow such intents. Laumann confesses that it’s “expensive to annotate huge training datasets and time-taking to train a well-performing model from scratch.” But with transfer learning and fine-tuning, NeuralSpace’s models can give accurate solutions with fewer inputs, thereby saving data annotation costs and gaining model performance. The best part, says Laumann, is that “developers do not even need to think about transfer learning” because NeuralSpace’s optimization algorithm (AutoNLP) takes care of everything all on its own. 
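To make the shape of the task concrete, here is a deliberately naive Python sketch of the intent-and-entity output in Laumann’s example. The keyword table and regex patterns are invented for illustration; NeuralSpace’s actual system fine-tunes language models rather than matching keywords.

```python
# A toy sketch of the intent/entity structure in the e-commerce chatbot
# example. Keyword lookup only illustrates the target structure of the
# task, not the fine-tuned-model technique NeuralSpace actually uses.
import re

INTENT_KEYWORDS = {
    "check order status": ["where is my", "order status", "track"],
    "request refund": ["refund", "money back"],
}

def classify(query: str) -> dict:
    """Return the example's target structure: an intent plus named entities."""
    text = query.lower()
    intent = next(
        (name for name, kws in INTENT_KEYWORDS.items()
         if any(kw in text for kw in kws)),
        "unknown",
    )
    entities = {}
    laptop = re.search(r"razer blade \d+", text)  # hypothetical product pattern
    if laptop:
        entities["laptop"] = laptop.group()
    date = re.search(r"\d+(?:st|nd|rd|th) of \w+", text)
    if date:
        entities["date"] = date.group()
    return {"intent": intent, "entities": entities}

result = classify("Where is my Razer Blade 14 that I ordered on the 4th of December?")
# result["intent"] is "check order status", with laptop and date entities
```

The point of the sketch is the output contract: a single intent label plus a dictionary of entities, exactly the structure the quoted example describes.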

NeuralSpace’s second answer to low-resource language processing is multilingual learning, in which a single model is trained on multiple languages. Multilingual learning works off of the assumption that “the model will learn representations that are very similar for similar words and sentences of different languages.” For example, NeuralSpace’s language model can transfer knowledge from a high-resource language (e.g. English) to a low-resource language (e.g. Swahili) via transfer learning, utilizing similarities between the languages. This process is much easier to scale and requires less storage, claims Laumann, and allows the model to upgrade to better architectures much easier. 

NeuralSpace’s multilingual models deliver higher performance across languages and also allow a model to generalize and make inferences about languages it has not been fine-tuned on. Laumann gives the example of Tamil, which can be predicted with high accuracy using previously trained English, Hindi, and Marathi data. Finally, multilingual models are simply cheaper to host—only one model is necessary for numerous languages, instead of one model for each and every language. 
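The shared-representation assumption behind multilingual learning can be illustrated with toy vectors. The numbers below are made up for the example; a real model learns such embeddings. The point is that translation equivalents land close together in one vector space, so knowledge transfers between them.

```python
# Toy illustration of the shared-representation idea: in a multilingual
# model, translation equivalents get similar vectors in one shared space.
# These 3-dimensional vectors are invented; real embeddings are learned.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

embeddings = {
    ("en", "water"): [0.90, 0.10, 0.05],
    ("sw", "maji"):  [0.88, 0.12, 0.07],  # Swahili "water", near its English twin
    ("en", "fire"):  [0.05, 0.95, 0.10],
}

same = cosine(embeddings[("en", "water")], embeddings[("sw", "maji")])
diff = cosine(embeddings[("en", "water")], embeddings[("en", "fire")])
# "water" and "maji" are nearly parallel; "water" and "fire" are not, so a
# classifier trained on English "water" sentences transfers to Swahili ones.
```

This geometric closeness is what lets a single model trained on English, Hindi, and Marathi make reasonable predictions about Tamil, as in Laumann’s example.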

The third and last answer to low-resource language processing is data augmentation, a “data pre-processing strategy that automatically creates new data without collecting it explicitly.” By synthesizing various iterations of the same sentence, a model can diversify its own training data in a cheap, fast, and unsupervised manner. NeuralSpace’s own data augmentation application, NeuralAug, lets models swap out words, word order, and translations to enrich the dataset of any given low-resource language, enhancing the overall robustness of the model. 
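Here is a minimal sketch of one such augmentation operation, synonym substitution, in the spirit of what the post describes NeuralAug doing. The synonym table is invented for illustration; a real system would draw candidates from lexicons or model predictions.

```python
# Minimal data-augmentation sketch: generate sentence variants by swapping
# in synonyms, so one labeled example becomes several. The synonym table
# below is invented for illustration.
from itertools import product

SYNONYMS = {
    "order": ["order", "package", "delivery"],
}

def augment(sentence: str) -> list[str]:
    """Return all variants of a sentence with listed synonyms substituted."""
    words = sentence.lower().split()
    options = [SYNONYMS.get(w, [w]) for w in words]  # words without synonyms stay fixed
    return [" ".join(combo) for combo in product(*options)]

variants = augment("Where is my order")
# 3 variants: the original plus "package" and "delivery" substitutions,
# all of which can share the original sentence's intent label.
```

Because every variant inherits the original sentence’s label, a scarce annotated dataset grows without any additional human annotation, which is the whole appeal for low-resource languages.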



These are fundamental, crucial steps that NeuralSpace takes to ensure that its low-resource language offerings are effective and accurate. This isn’t to say that low-resource language processing is nearing perfection; much work has yet to be done to render it more commercially applicable. However, NeuralSpace’s presence in the language industry landscape—a voice of reason calling for the democratization of NLP—is meaningful in that it sheds light on the inherent biases of English- and FIGS-centric language models and offers a more level playing field for low-resource languages. 

In that sense, NeuralSpace is true to Laumann’s old fascination with the beauty of languages—languages that are rarely read or heard in a cyberspace dominated by major languages. To appreciate the beauty of language is to celebrate the diversity of spoken and written languages, using them not only in local and ceremonial settings but also in hard, cold industry settings. NeuralSpace’s efforts to implement low-resource language application in business is but the start of the rise of low-resource languages; English is counting down its years as the reigning lingua franca of the internet. 


Sprok DTS is dedicated to providing fast, accurate, and professional localization and translation for all your language needs. Our services come in 72 languages, including high-resource ones such as English, Spanish, and Japanese, and low-resource ones, such as Kyrgyz, Galician, and Lao. Powered by the world’s leading neural machine translation technology, our localization experts and translators ensure speed and accuracy on all your projects. Ask for a free quote today on our website. 




Google Area 120 Introduces Aloud, an Instantaneous Dubbing Program

By now, most of us are accustomed to auto-generated subtitles on YouTube videos. Some of us might miss the good old days of community captions and wonder why that feature ever left the platform (this post on Data Horde is required reading for anyone interested in the issue.) 

But it’s 2022, and instead of subtitles, we now have dubs. From Google’s Area 120 incubator—a testing ground for up-and-coming applications—comes Aloud, a program that allows content creators to quickly and easily dub their videos into multiple languages. The product has yet to be released, but it already performs exceptionally well dubbing English videos to Spanish, Portuguese, Hindi, and Indonesian. 


Why Aloud?

The founders of Aloud—Buddhika Kottahachchi and Sasakthi Abeysinghe—hope to make dubbing more affordable and to bridge communication across languages. People on the other side of the globe can enjoy a content creator’s videos without having to learn the language. It’s important to note, however, that dubbing, as an aural format, doesn’t cater to deaf people enjoying YouTube videos (an important issue in itself), but it will allow other previously disenfranchised viewers—blind people, for example—to enjoy international content. 

Kottahachchi and Abeysinghe mention how they were inspired by their childhoods in Sri Lanka, learning to read and understand English to learn more about the world. Their friends who did not pick up English, however, had a much harder time connecting with the world. Add to this the share of global viewers who use video to learn (46%) and the share of people who use English (24%). The founders also note major differences between subtitles and dubbing: subtitles are not ideal on mobile devices, require constant attention to the screen, and can be hard to read for people with visual or reading impairments. 


How does it work?

Aloud uses “advances in audio separation, machine translation and speech synthesis” to reduce time spent on dubbing, translation, video editing, and audio production, allowing content creators to dub their videos without much effort or resources. Creators have the option of uploading a transcription to be used for the dub or using generated text transcriptions as a base for dubs. Aloud’s founders plan to release the program free of cost. 
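Google hasn’t published Aloud’s internals, but the description suggests a pipeline whose shape is easy to sketch. The sketch below is entirely our own illustration (every function name and signature is an assumption, not Aloud’s API); the strings only trace the data flow.

```python
# Hypothetical sketch of the pipeline described above: audio separation,
# then transcription, then translation, then speech synthesis.
# None of these functions come from Aloud.

def separate_audio(track: str) -> tuple[str, str]:
    """Split the mixed track into speech and background (music, effects)."""
    return f"speech({track})", f"background({track})"

def transcribe(speech: str) -> str:
    """Generate a transcript; a creator-supplied transcript could replace this."""
    return f"transcript of {speech}"

def translate(transcript: str, target_lang: str) -> str:
    """Machine-translate the transcript into the target language."""
    return f"[{target_lang}] {transcript}"

def synthesize(text: str) -> str:
    """Produce synthetic speech from the translated text."""
    return f"tts({text})"

def dub(track: str, target_lang: str) -> str:
    """End-to-end dub: translated synthetic speech mixed over the background."""
    speech, background = separate_audio(track)
    dubbed_speech = synthesize(translate(transcribe(speech), target_lang))
    return f"mix({dubbed_speech}, {background})"

print(dub("my_video.wav", "es"))
```

Separating speech from background before translating is what would let music and sound effects survive the dub intact.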

YouTube has yet to support multiple audio tracks on its videos, but the company is currently testing the feature; once multiple tracks are available, viewers will be able to switch to Aloud-generated dubs in their language of choice. For now, it seems as if creators using Aloud will have to post dubbed videos separately, as is the case with this English video translated into Spanish and Portuguese. 


How does it feel?

Aloud’s website offers a few sample videos that showcase the product’s application on actual YouTube videos, and as is to be expected, the voice is that classic artificial monotone we hear everywhere. It sounds more fluent than some others we’ve heard before, but one can’t shake the feeling that this computerized intonation will, at some point, tire us out. There’s a certain humanness it lacks; halfway into a sample video, we find ourselves yearning for a human dub. 

Human voice actors, for one, can convey emotions and verbal nuances that accompany the language. If it’s a video on why graves are dug 6 feet into the earth, then the dub should reflect the somber, horrific nature of the video. If it’s a lesson on biology aimed at children, then a factual tone and intonation should accompany the video—matching the slow speed of the original audio. For audiences that are sensitive to these nuances, Aloud isn’t the most effective tool. 

Creators who have tried out Aloud seem positive about the new audiences they can reach with so little effort. Kings and Generals, a channel that creates and uploads animated historical documentaries and boasts 2.4 million subscribers, says that their “audience loved it,” and that they are “looking forward to trying this many more times.” The Amoeba Sisters, a channel with 1.3 million subscribers and fun science videos, say that “this tool was easy to use and so convenient,” and that they “are so grateful to have a way to reach more audiences through dubbed videos.” 

Creators, small- or big-time, will gladly accept Aloud as a new way of diversifying their audience; the untapped market in video content is much bigger than the current English-speaking market many creators appeal to. 


The verdict

Without metadata (tone, intonation, voice) taken into consideration, it’s hard to say how effective Aloud will be at conveying and translating the fullness of the original video. As of now, Aloud sounds like your average, run-of-the-mill frontend AI narrator. Theoretically, any content creator can transcribe their video, run the transcript through a translator, record a voice-over of the output, and use that as the new audio for a translated video.

But that process will take hours, if not days, to complete. What’s important about Aloud is that it has streamlined the process for individual users and creators to take control of the dubbing process so that they can make their content more available and accessible to non-English speakers, despite the reduction in tonal, aural qualities. In that aspect, Aloud’s mission is a valuable one, capable of affecting the lives of hundreds of millions across the world who don’t have access to English videos. 


How does this relate to me?

If you’re a translator or voice actor and are constantly on the lookout for the next big development in AI that will topple your (or should we say, our) career in the language industry, Aloud might not be the most welcoming news. With up-and-coming companies like DeepDub threatening to take away the livelihoods of many a voice actor, Aloud is just one more product to worry about. 

However, as we’ve mentioned before in this blog, the limitations of automated dubbing and translation are manifold. Even if automated dubs become feasible, translators will still be needed, for example, to check or carry out translations before the text is fed into the dubbing program. Considering that it will take a long time before automated dubbing voices reach human parity, voice actors also shouldn’t have to worry about being replaced; these products aren’t going to be dubbing movies or TV shows anytime soon. 

If you’re a creator, it will be beneficial to take a moment and reflect on what programs like Aloud really mean for the language industry and video content ecosystem. How will viewers react to videos made with automated dubs? Will this help or hinder my channel’s reach? Could this possibly affect translators, voice actors, and other professionals working in the industry? How are their translations and dubs different from the ones provided by products like Aloud?

It will also be helpful to think about what programs like Aloud leave out. For example, those who are deaf or hard of hearing will not benefit from these products. If we are truly aiming for an accessible ecosystem, then subtitling products will have to accompany dubbing products. 

These are some things to think about as we move forward into the uncharted realm of automated video augmentation; no matter the benefits and shortcomings of artificial intelligence and machine translation, we must make sure no one is left behind in the wake of development. 




Friend or Foe: Machine Translation Implementation in Literature

In her lecture, “The World as India,” Susan Sontag notes that “translation is the circulatory system of the world’s literatures.” The translation she refers to here is literary translation—that branch of translation so often revered as the highest among its lesser, commercial peers such as technical translation, legal translation, etc. José Saramago makes a similar claim: “writers make national literature, while translators make universal literature.” 

Why do people venerate literary translation, and what sets it apart from other, more lucrative forms of translation? For starters, literary translation has always been associated with a certain sense of deliberation and contemplation, akin to creative writing and its highbrow superiority over technical and commercial writing. While the last decade has seen creativity play a bigger role in commercial sectors (e.g. transcreation, localization), the prestige these sectors enjoy comes nowhere close to that of literary translation. 

But each day, the gap between literary and non-literary translation narrows a little more, in large part due to machine translation. With the advent of MT technology, translation has been demoted from the echelons of professionalism to an everyday chore: anyone can translate with the help of Google Translate (or some other generic engine), and anyone can achieve better translation results than ever before.

In the fourth issue of Counterpoint, the official online publication of the European Council of Literary Translators’ Associations (CEATL), president Morten Visby remarks that the threat of machine translation is a real one. Although there still remains for literary translators a “strong public perception of our cultural value as literary creators” and hence publishers are still wary of utilizing machine translation to cut costs, Visby claims that “literary translators may find themselves in an even more precarious position than their current one.” Machine translation already proffers plausible, near-human translations of contemporary genre literature—romance, thriller, mystery—between closely related languages, such as the Nordic or Romance languages.

In the same issue, other literary translators argue the opposite. A main proponent of the human translator’s irreplaceability is James Hadley, assistant professor in literary translation at Trinity College Dublin, who points out several crucial differences between human and machine translation in literature. For one, machine translation currently performs well only in the specific domain in which the machine was trained; a machine trained on technical corpora will give better technical translations than, say, cookbook translations. On the other hand, Hadley writes that “in literature, not only are the writing conventions substantially different from many technical texts, [but] these conventions differ substantially between authors, time periods, genres, and forms of literature.” In other words, machines are not yet nuanced or robust enough to differentiate between the multifarious domains present within literature, and as a result, cannot provide fitting translations for literature. 

Another difference is that machine translation still operates on the word- and sentence-level, incapable of contextualization and chapter- (or book-) level translation, meaning sentences are translated in isolation. “But for literature, where ideas, metaphors, allusions and images can be recalled sentences, paragraphs, or even chapters later,” says Hadley, “the machines have a long way to go before they will be able to approach the skills of a human literary translator.” 

That’s not to say machines have no place in literary translation. For Hadley, machines do offer some help to translators, such as giving human translators “key details about the source text at a glance, which will allow them to work as efficiently as possible.” As such, Hadley notes that “developers [are] working on tools specifically to help literary translators.” The machine is an aid to human translation, not a replacement or alternative. 

Other translators in this issue of Counterpoint have similar opinions on this matter. UCD lecturer in German Hans-Christian Oeser delves deeper into this issue, speaking of his experiences in both human translation and post-editing. Oeser participated in a 2018 experiment comparing post-editing and human translation, the source text being F. Scott Fitzgerald’s The Beautiful and Damned; after completing both types of translation, Oeser found that “[his] “textual voice” was somewhat diminished in [his] post-edited work compared to its stronger manifestation in [his] earlier machine-independent German version.” 

Oeser isn’t completely disillusioned by machine translation; he notes that “in terms of the time spent on post-editing, as opposed to translating from scratch, it could be argued that the overall effort is somewhat less time-consuming.” Plus, Oeser admits that there is a degree of comfort that comes with knowing that the work of translating has already been carried out to a certain extent. 

But like Hadley, Oeser is not easily convinced. “The machine has, as of yet, no proper sense of context, of wordplay, ambiguity, polysemy, and metaphor or of rhetorical devices such as alliteration and assonance,” he writes; “it frequently mistranslates, using inappropriate words and phrases, seemingly chosen at random from its vast lexicon.” Even worse, the machine cannot maintain awareness of “elegance, of beauty, of stylistic coherence (or indeed intended breach of style),” and is thus devoid of that characteristic “sound” that all translators write with. 

In the end, Oeser comes to the same conclusion as Hadley: “I would propose that every literary translator ought to have the possibility and the right to utilise every tool at their disposal,” including digital dictionaries, translation memory and terminology management software, and online translation programs “of every description.” Machines can’t yet replace us, but they can aid us. As for the malicious publishing practice of utilizing post-editing as a means to cut costs, Oeser is vehemently against it and rallies his fellow translators against such practices, saying “we will have to be the Luddites of the humanities!” 

This spread of post-editing practices is already underway around the world; complete machine translation has yet to arrive, but MT developments are already wresting control away from human translators via post-editing. An assistant professor of literary translation at the University of Vienna, Waltraud Kolb is worried about the future; she anticipates “increasing pressure from publishers to cut costs this way, even though post-editing may be as cognitively demanding and time-consuming as translation.”

Kolb has previously carried out research on the effectiveness of high-quality post-editing, and found that it is “not much faster [than normal human translation]… the post-editors changed 90% of the sentences [of the source text] – the main problems the machine had were cohesion, reference, idioms, polysemy.” But publishing houses are likely to pay post-editors much less, given that the brunt of the “original” translation has been accomplished by a machine. 

With increasing usage of post-editing come subsequent problems: Kolb predicts that wider post-editing practices could signify a near future in which “language will eventually become more uniform.” There’s also concern about a decline in the general quality of writing—as Oeser notes—as well as about copyright issues, a prevalent concern across all fields of translation. 

Post-editing—and the subsequent erasure of the translator’s role—is something Lawrence Venuti would have feared. A prominent translator and theorist, Venuti is known for his 1995 book, The Translator’s Invisibility, in which he argues against domesticating translations: a practice which effaces the translator’s presence in an attempt to make the translation sound more fluent in the target language. According to Venuti, the translator plays a part in the novel’s translation process as important as the author’s, so much so that one could argue that a translator’s work is an original work in itself. 

In the case of post-editing, translators are rendered invisible not by the fluency with which they carry over the writer’s original voice; rather, translators are effaced by the monotone, unimodal voice that is the translation engine. Of course, different engines might sound different, but none of them will be able to offer, anytime soon, a nuanced, varied voice like the ones professional literary translators write with. 

This isn’t exactly in the vein of Venuti’s writing, but the logic applies; the translator has always held a precarious position, subject to erasure, deemed inferior to the authorship that looms over the entire text. These are practices that we should actively be moving away from; post-editing seems to be a regression back to the dark ages of translation which, arguably, we are still going through. This post on Book Riot by writer and editor Leah Rachel von Essen reveals that the English translations of many best-selling writers outside the Anglosphere are often published with the translators’ names hidden or omitted. These writers include Elena Ferrante, Haruki Murakami, Stieg Larsson, Cornelia Funke, and Mieko Kawakami. 

In short, a major concern to be addressed with the tide of post-editing and machine translation is the translator’s voice and agency: how do these new pieces of technology aid the translator in their job, physically and status-wise? Who do these developments in MT help, and at what cost?




Automated Linguistic Evaluations and Recent Acquisition News

Emmersion and the Rise of AI-Powered Language Evaluation

In a 2020 report, staff writers at Customer Contact Week (CCW) give four reasons why language testing for companies—especially ones like contact centers—is a challenging, strenuous process for which companies don’t readily have solutions. It’s clear that high-quality communication skills are imperative for any linguist (or any employee in the language industry, for that matter), but previous methods of linguistic evaluation and hiring aren’t as accurate as people think they are. “Existing, manual measures tend to be as inefficient as they are ineffective,” says CCW, “yielding recruiting processes that are costly and time-consuming without even offering the rewards of better agent performance or retention.”

There are four specific reasons why existing language evaluation tests are inefficient, according to CCW:

Conversational testing is vulnerable to significant bias and subjectivity. An in-person recruiter may mistake a candidate’s natural charisma for language competency, leading them to hire people who will ultimately struggle to develop product knowledge and support customers.

Manual testing can be a time-consuming process, which is a luxury many contact center recruiters do not have. Some are responsible for immediately hiring a mass of agents to meet an internal or third-party need. Many, moreover, recruit from a talent pool that expects an “on-the-spot” offer following a successful interview.

Because they assess “academic” language capabilities, traditional language tests may not sufficiently reflect an agent’s ability to engage in natural conversations and deliver stellar experiences.

Often broad and static in nature, traditional language tests may not sufficiently determine whether an agent can handle a company’s specific issues or support a particular demographic of customers.

In short, current methods for evaluating language proficiency simply aren’t robust enough to fully evaluate a subject’s multifaceted linguistic capabilities. This hurts the company in both customer service and training costs; if a company mistakenly hires an unqualified linguist, it either has to train them or suffer a blow to its customer experience. 

That was before COVID-19, when offline, face-to-face interaction was the norm. Back then, companies could at least train underqualified agents in person; now, workers are spread far apart, and customer demand for online communication never seems to stop increasing. CCW gives four additional challenges to language evaluation that COVID-19 poses:

Dealing with dramatic changes in volume and new customer expectations, some companies need to quickly increase their headcounts. Infamously time-consuming, traditional language tests prevent companies from rapidly scaling their recruiting efforts.

Already subjective and ineffective when conducted as part of face-to-face meetings, language tests can be even harder to execute during remote video interviews.

With COVID-19 increasing omnichannel communication, agents require the ability to successfully communicate via voice and text. Language testing, therefore, needs to evolve to assess competency in all communication channels.

With fewer opportunities for face-to-face guidance, new agents will have to independently develop knowledge and perfect their customer engagement skills. They will also have fewer opportunities to ask their peers or supervisors for help during interactions. Strong language skills are essential for succeeding in this more autonomous landscape.

Emmersion offers a solution to these problems with its speaking and writing evaluation tests. Harnessing the power of artificial intelligence, Emmersion’s tests are adaptive and efficient, taking into account the varieties in people’s dialects and the linguistic differences between academia, the workplace, and everyday conversations. 

In a recent podcast hosted by Slator’s Esther Bond and Florian Faes, Emmersion CEO Brigham Tomco goes into more detail on how Emmersion’s products work. One particularity is Emmersion’s very own AI assessment engine, which not only offers adaptive language evaluation tests but also gives full, detailed reports on the tester’s language skills. The whole testing process takes around 15 minutes, minimizing the time and manual resources spent on language evaluation. 

Like other language evaluation programs, Emmersion’s AI assessment engine does use general front-end services (such as those offered by Google) to run an initial speech recognition, analyzing the speaker’s words for pronunciation. Emmersion, however, goes above and beyond, also analyzing for language abilities, grammar, and a plethora of other categories, thus compiling a more nuanced, thorough picture of a person’s linguistic abilities. 

The 2020 CCW report notes that Emmersion’s innovative products will effectively yield better customer service, providing six specific benefits that Emmersion brings to the automated language evaluation field: 

AI language testing solutions, such as the TrueNorth Speaking Test by Emmersion, can quickly, accurately and objectively assess speaking ability.

Tests, which do not need to be administered by an in-person “proctor,” account for candidates’ ability to process conversations, repeat and rephrase information, and confidently answer open-ended questions.

The AI-driven tests can be adaptive, tailoring prompts based on the candidate’s background and company’s needs.

The robust, instant scoring system accounts for various nuances of a candidate’s speaking ability, including difficulty of vocabulary, repetitive use of phrases, pauses, and quality of sounds and words.

In addition to providing a more accurate assessment, this robust evaluation helps recruiters understand the pros and cons of each candidate. With this information, companies in a staffing crunch can intelligently relax certain standards while still ensuring they are hiring agents with enough competency to perform pivotal tasks.

The nuanced scoring can also help companies make accurate predictions about a given agent’s long-term success and happiness on the job, leading to higher retention rates and thus more consistent customer experiences.
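The “adaptive” tailoring mentioned above can be illustrated with a simple staircase procedure. This is our own toy sketch of the general idea, not Emmersion’s engine: item difficulty rises after a correct answer and falls after a miss, homing in on the candidate’s level.

```python
# Toy staircase procedure illustrating adaptive testing (our own sketch,
# not Emmersion's engine).

def run_adaptive_test(answer_fn, n_items=10, start=5, lo=1, hi=10):
    """answer_fn(difficulty) returns True if the candidate answers correctly."""
    difficulty, history = start, []
    for _ in range(n_items):
        correct = answer_fn(difficulty)
        history.append((difficulty, correct))
        # raise difficulty after a correct answer, lower it after a miss
        difficulty = min(hi, difficulty + 1) if correct else max(lo, difficulty - 1)
    return history

# Simulated candidate who can handle anything up to difficulty 7:
history = run_adaptive_test(lambda d: d <= 7)
print(history)  # the staircase quickly settles to oscillating around 7-8
```

A real engine would score many more dimensions than a binary right/wrong, but the payoff is the same: the test spends most of its items near the candidate’s actual level instead of wasting time on ones far too easy or too hard.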

CEO Brigham Tomco claims that taking the recruiter out of the equation could save at least 30 minutes per applicant, totaling over 2,500 hours saved per year for recruiters. In turn, these saved hours translate to tens or hundreds of thousands of dollars. 
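The arithmetic is easy to verify. A minimal sketch, assuming 5,000 applicants per year and a 40 USD/hour recruiter cost (both figures ours, chosen only to make the claim concrete, not Emmersion’s):

```python
# Back-of-the-envelope check of the time-savings claim above.
minutes_saved_per_applicant = 30
applicants_per_year = 5_000          # assumed volume for a large recruiter
hourly_recruiter_cost_usd = 40       # assumed fully loaded cost

hours_saved = minutes_saved_per_applicant * applicants_per_year / 60
print(hours_saved)                              # prints 2500.0
print(hours_saved * hourly_recruiter_cost_usd)  # prints 100000.0
```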

These are trends, CCW posits, that are here to stay. With the advent of automated language evaluations, “operational productivity will rise, internal engagement will increase, and customer happiness will skyrocket,” according to CCW authors, and Emmersion’s products and results are proof that these practices work. With 750 clients—academic institutions and business enterprises alike—under their belt, Emmersion signals a new paradigm for language evaluation in a world where artificial intelligence is utilized to complement and improve the human experience. 


Transcription Company Verbit Acquires UK-Based Take Note

Mergers and acquisitions are an increasingly common trend in the language industry; Amplexor’s 2021 merger with Acolad was one of the biggest events of the industry last year. The process—M&A for short—has a long history, dating as far back as 1784 when the Italian banks Monte dei Paschi and Monte Pio merged as the Monti Reuniti, and 1821 when the Hudson’s Bay Company merged with its rival North West Company. The result, almost always, is a movement towards market domination and risk reduction—which, in other words, signifies improvements to financial performance alongside a number of other monetary benefits. 

Such is the case for Verbit, a New York-based multilingual transcription company. Its acquisition of Take Note—a similar transcription provider, but based in the UK—takes Verbit one step closer to its “goal of becoming a one-stop-shop for all voice AI needs.” But the acquisition wasn’t just out of the blue; Take Note specializes in market research, and in acquiring the company, Verbit can expand its “corporate services portfolio” for its market research customers, all the while enhancing its usage of UK English. Slator notes that this acquisition will bring “a degree of consolidation to the fragmented transcription market.” 

Take Note is the latest acquisition in Verbit’s widespread expansion in the transcription industry. Slator’s Esther Bond tracks Verbit’s history of acquisitions: “the company bought media captioning company, VITAC, in May 2021, and government and education-focused captioning provider, Automated Sync Technologies (AST) in December 2021.” To Bond’s question of how the company will develop specific automation solutions, Verbit CEO Tom Livne responded that he sees “a wide range of use cases for our Voice AI solution” in the field of market research. 




XTM Returns with a Bang and Other Updates from the Language Industry

2022 Translation Technology Summit to Take Place in Silicon Valley

After two years of virtual conferences, the Translation Technology Summit (XTM)—one of the world’s biggest localization technology events—resumes in-person events with XTM LIVE 2022. The conference will take place in Silicon Valley on April 27th and 28th; the venue is the renowned Grand Bay Hotel San Francisco. Some of the language industry’s most prominent and innovative leaders will attend to discuss and debate trends, issues, and challenges in global communication. 

The keynote speech will be delivered by Richard Yonck, the famous futurist and consultant known for his books Future Minds and Heart of the Machine about artificial intelligence and its place in the world today and tomorrow. Speaking alongside Yonck is Joel Sahleen, the Director of International Engineering at Domo, which is a US-based business intelligence and analytics platform; Sahleen will speak about data-driven strategies and Domo’s experience with localization. 

Other speakers include Yuka Kurihara from Scaled Agile with her presentation on how “coupling agility with scalability can help businesses adjust to demand without their money or reputation taking a hit.” There’s also Sergey Parievsky from Juniper Networks and Yasmin Vanja from Sony who will remark on their experience in the language technology industry. Other prominent speakers include Talia Baruch (CEO and Cofounder of GlobalSaké), Rocio Gray (Localization Manager at Crown Equipment Corporation), Rafał Jaworski (Linguistic AI Expert at XTM International), Andreas Merz (Translation and Terminology Specialist EMEA at Crown Gabelstapler GmbH & Co. KG), and others. 

Tickets for full conference attendance are 900 USD, but the Super Early Bird Special price is three-fourths that amount (675 USD). The same discount applies to single-day attendees, with the first day selling for 400 USD (300 USD with the discount) and the second day currently at 600 USD (450 USD with the discount). 


Waverly Labs Announces New Services and Hardware at This Year’s CES

Waverly Labs, developer of instant-translation earpiece devices, has recently announced a whole line of new services and products. There is Subtitles: a “two-sided customer service counter display facilitating hygienic, safe in-person interactions.” There is also Audience: a “translation solution allowing speakers and lecturers to communicate with auditorium attendees.” Finally, Waverly Labs announced an updated version of their hit product Ambassador Interpreter: the “over-the-ear device providing near-simultaneous audio and text translations.”



Subtitles is a two-sided monitor embedded in a glass screen, providing simultaneous translation while maintaining COVID-19 protocols. After selecting a language on the touchscreen, users can talk naturally; meanwhile, the conversation is translated in real time and displayed as text on the other person’s screen in their chosen language. “The proprietary technology allows users to communicate like watching a subtitled movie and reading what the other person said,” says Waverly Labs. Subtitles is useful not only for translation, but also for transcription for customers with hearing impairments. 

The device currently supports the following languages: English, Mandarin, Cantonese, Spanish, Arabic, Korean, French, German, Italian, Portuguese, Greek, Russian, Hindi, Turkish, Polish, Japanese, Hebrew, Thai, Vietnamese, Dutch. It also supports 15 Arabic dialects, 20 Spanish dialects, 9 English dialects, as well as the regional dialects of French, Portuguese, and Chinese. The device will be available in the second quarter of 2022. 



Audience isn’t a device, but a translation solution particularly useful for “lecturers, educators, and conference event organizers,” as it allows presenters to have their words translated for the audience in their respective languages. The process is simple: the speaker’s words are captured by the mic and then processed and translated in Waverly’s own cloud infrastructure. Audience members can then listen to a translated version of the speech—in their choice of languages—on their own phones or tablets. Audience is expected to be available in the second quarter of 2022 as well and supports the same languages and dialects as Subtitles. 

Image credits: Waverly Labs


The Economist’s Analysis of Spotify Data & the Decline of English

In an effort to investigate global music tastes and their evolution through time, researchers at The Economist gathered data on the top 100 tracks in 70 countries, examining over 13,000 hit songs in 70 languages alongside other data such as genre, lyrical language, and artist nationality. In doing so, The Economist hoped to group countries according to “musical similarity.” As for their research methodology and results, here’s what they say:

On these 320,000 records, we employed a principal-components analysis to assess the degree of musical kinship between countries, and then a clustering algorithm (known as k-means) to group them. Three broad clusters emerged: a contingent in which English is dominant; a Spanish-language ecosystem; and a third group that mostly enjoys local songs in various tongues. Across all, one trend emerged: the hegemony of English is in decline.
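For readers unfamiliar with the method, k-means simply groups countries whose feature vectors sit close together. Here is a toy version of that grouping step; all the numbers are invented for illustration (the real analysis ran a principal-components step over 320,000 records first), and the deterministic seeding is our simplification.

```python
from math import dist

# Each country is a vector of invented stream shares:
# (English, Spanish, local-language).
countries = ["US", "UK", "Mexico", "Spain", "Japan", "Brazil"]
X = [
    (0.95, 0.03, 0.02),
    (0.93, 0.02, 0.05),
    (0.14, 0.80, 0.06),
    (0.20, 0.72, 0.08),
    (0.15, 0.01, 0.84),
    (0.18, 0.05, 0.77),
]

def mean(points):
    return tuple(sum(p[i] for p in points) / len(points)
                 for i in range(len(points[0])))

def kmeans(X, init, n_iter=20):
    """Plain k-means with a fixed, deterministic initialization."""
    centroids = [X[i] for i in init]
    for _ in range(n_iter):
        # assignment step: each point goes to its nearest centroid
        labels = [min(range(len(centroids)), key=lambda j: dist(x, centroids[j]))
                  for x in X]
        # update step: move each centroid to the mean of its members
        centroids = [
            mean([x for x, lab in zip(X, labels) if lab == j])
            if any(lab == j for lab in labels) else centroids[j]
            for j in range(len(centroids))
        ]
    return labels

labels = kmeans(X, init=[0, 2, 4])  # seed with US, Mexico, Japan
for country, lab in zip(countries, labels):
    print(country, "-> cluster", lab)
```

On this toy data the three clusters that emerge mirror the article’s finding: an English-dominant group, a Spanish-language group, and a local-language group.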

In the Spanish “cluster,” the percentage of English-language songs fell from 25% to 14%, a phenomenon The Economist attributes to artists such as Bad Bunny and Rauw Alejandro becoming internationally popular. Other clusters—notably those of Brazil, France, and Japan—have experienced even steeper declines in English-language songs, from 52% to 30%. 

The article does note that “despite its decline, English is still king. Of the 50 most-streamed tracks on Spotify over the past five years, 47 were in English.” However, it’s a hegemony that’s soon to be challenged by the rise of hit songs in languages other than English. With more linguistic variety in the global music scene comes a greater need for translation—people want to understand what they’re hearing and enjoying. The trend of international, multilingual music will counterintuitively bring people together, not set them apart, aided by the tools of translation freely available to all. 



Elden Ring Message Mistranslations & Machine Translation Errors

The creators of the fiendishly difficult Dark Souls franchise recently released a new game, Elden Ring, which is just as devilish in its difficulty and just as beautiful in its gameplay. Many users were amused by the in-game message system, which allows players from all over the world to leave messages for other players; the messages (which are written by choosing preselected words in the system) are translated via machine translation to fit the language of the reader. 

According to reporter Iyane Agossah of DualShockers, however, many English-speaking Elden Ring players have utilized the messaging function to leave memes around the fantastical world of Elden Ring; by putting together the words “fort” and “night,” the players are able to spell out (phonetically) Fortnite, a popular multiplayer game that has become a meme sensation among younger generations and gamers. 

This became a problem, however, when Japanese players read these messages post-translation and the joke was lost in translation. Agossah explains how Japanese players “spent quite a lot of time looking for some special event triggered at night inside forts, only to finally get the joke and realize it was a pointless effort.” While not strictly the machine translation’s fault, the episode does say something about machine translation’s inability to comprehend memes and popular trends, and about how such mistranslations lead to actual hours wasted by players who take them at face value. 
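The failure mode is easy to reproduce in miniature. The sketch below is entirely our own construction (a two-entry toy lexicon, not the game’s actual message system): messages are assembled from preset words and machine-translated on meaning, so the English phonetics that spell out “Fortnite” are discarded.

```python
# Toy reconstruction of why the pun breaks under machine translation.
PRESET_JA = {"fort": "砦", "night": "夜"}  # illustrative dictionary entries

def build_message(words):
    """Assemble a message from the game's preset word list."""
    return " ".join(words)

def machine_translate(message, lexicon):
    """Word-by-word semantic translation; sound is not preserved."""
    return "".join(lexicon[word] for word in message.split())

msg = build_message(["fort", "night"])    # reads as "Fortnite" aloud
print(machine_translate(msg, PRESET_JA))  # prints 砦夜: the pun is gone
```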

A similar, if less serious, episode occurred in the other direction, from Japanese to English, when the popular Japanese internet slang “草” (the equivalent of the English “lol”) was translated literally as “grass.” Episodes like these call into question the ability and feasibility of machine translation in real-life contexts. It’s one thing for a mistranslation to be amusing; it’s another for it to be misleading. As linguists, it’s important to be mindful of machine translation’s shortcomings before we rely on it too much. 





Google Pathways, Machine Translation, and Other Language Industry News

Introducing Google Pathways

Last October, Google Research’s senior vice president Jeff Dean introduced Pathways to the rest of the world. An AI architecture that can piece together previous knowledge to solve new tasks, Pathways is a work in progress whose purpose is to break artificial intelligence out of its cumbersome, inefficient shell. To some, it’s another new experiment Google AI is cooking up. But for others, it’s the next chapter in the story of artificial intelligence nearing the intuitive cerebral processes of the human brain. 

So what exactly does Pathways do? It’s a new AI architecture designed to address weaknesses in existing systems and synthesize their strengths. Current AI systems have a number of issues: they’re designed with specific purposes in mind, meaning they’re pretty useless outside their original context. They’re also reliant on a single sense or mode of input—quite unlike humans, who employ five (or six?) senses to make sense of this world. Finally, they’re dense and inefficient, requiring far too much data and energy for the smallest task. Here’s how Pathways plans to solve these issues, one by one. 


Problem 1: Current AI models are trained to do only one thing.

Existing AI systems are often built from scratch, designed with one problem in mind. Dean compares this to our childhood experience of learning to jump rope: “Imagine if, every time you learned a new skill (jumping rope, for example), you forgot everything you’d learned – how to balance, how to leap, how to coordinate the movement of your hands – and started learning each new skill from nothing.” That is how developers and scientists train most machine learning models. 

Instead of improving current models so that they become more robust and can take on new tasks, scientists build new models from scratch; it’s a tedious and time-consuming process. All that’s left are thousands of models for thousands of tasks. With this method, it takes longer for models to learn each task, since each one must learn about the world from nothing. This is completely different from how humans approach new tasks: we apply our previous knowledge, identifying the parts of a task we already know so as to carry it out as efficiently as we can (if we want to, that is).

Dean and his colleagues at Google propose Pathways as a remedy for this problem in AI; Pathways is a model that can “not only handle many separate tasks, but also draw upon and combine its existing skills to learn new tasks faster and more effectively.” For example, per Dean, a model that learns how to analyze aerial images to predict landscape elevations would be able to apply that knowledge to predicting how flood waters will flow through that landscape. “A bit closer to the way the mammalian brain generalizes across tasks,” explains Dean. 


Problem 2: Current AI models focus on one sense.

Another problem with existing AI systems is that they are oblivious to context and connected ideas. “Most of today’s models process just one modality of information at a time,” says Dean: “they can take in text, or images or speech – but typically not all three at once.” This differs from how humans take in information: we utilize multiple senses at the same time to account for the multisensory nature of reality. 

Scientists hope to solve this problem through Pathways, teaching it to “encompass vision, auditory, and language understanding simultaneously.” Dean offers this enlightening metaphor for how this would work: 

So whether the model is processing the word “leopard,” the sound of someone saying “leopard,” or a video of a leopard running, the same response is activated internally: the concept of a leopard. The result is a model that’s more insightful and less prone to mistakes and biases.

Unlike previous models, Pathways would be able to handle more abstract forms of data, says Dean; such abilities of abstraction will allow scientists to deal with more complex systems. 
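The “one concept, many modalities” idea can be sketched in miniature. In the toy below (purely illustrative; the encoders and the hashing trick are stand-ins for learned networks, not anything from Pathways), three modality-specific encoders all project their input into the same vector space, so “leopard” as text, speech transcript, or image label activates one internal representation:

```python
import hashlib

def _toy_vec(concept, dim=8):
    """Deterministic pseudo-embedding (a stand-in for a learned encoder)."""
    digest = hashlib.sha256(concept.encode()).digest()
    return [b / 255 for b in digest[:dim]]

# Modality-specific "encoders" that all project into the SAME space.
def encode_text(word):        return _toy_vec(word.lower())
def encode_audio(transcript): return _toy_vec(transcript.lower())
def encode_image(label):      return _toy_vec(label.lower())

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sum(x * x for x in a) ** 0.5 * sum(y * y for y in b) ** 0.5)

# "leopard" as text, speech, or image maps to the same internal vector
t = encode_text("Leopard")
a = encode_audio("leopard")
i = encode_image("LEOPARD")
assert t == a == i
```

In a real multimodal system the encoders would be trained so that embeddings of the same concept land close together in the shared space; the hash here simply fakes that outcome for illustration.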


Problem 3: Current AI models are dense and inefficient.

Lastly, Dean points out that existing AI systems are “dense,” which is to say that, to accomplish a given task, a model usually activates the entire neural network, even if the task at hand is simple. This is markedly different from how humans deal with tasks; humans only utilize relevant pieces of information and activate corresponding parts of the brain to solve the situation. “There are close to a hundred billion neurons in your brain,” says Dean, “but you rely on a small fraction of them to interpret this sentence.” 

With Pathways, new AI models will be “sparsely” activated; in other words, only small parts of the network will be utilized as needed. Through this process, AI models will dynamically learn which parts of the network are good at which tasks, so that they can allocate the parts that best fit the needs of a given task. Such an architecture, Dean claims, is not only faster and more energy-efficient but also has a larger capacity to learn more kinds of tasks. As an example, Dean brings up GShard and Switch Transformer—two of the largest machine learning models—which already utilize sparse activation. The two models consume less than a tenth of the energy normally required by dense models of similar size, while sacrificing none of the accuracy. 
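Sparse activation can be illustrated with a toy mixture-of-experts layer (a sketch of the general technique, not Google’s implementation): many “expert” sub-networks exist, but a router evaluates only the top-k of them for any given input, leaving the rest idle.

```python
import math
import random

random.seed(0)  # make the toy deterministic

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

class SparseLayer:
    """Toy mixture-of-experts layer: many 'experts' exist, but only
    the top-k highest-scoring ones are evaluated for a given input."""
    def __init__(self, experts, k=2):
        self.experts = experts                                 # list of callables
        self.k = k
        self.gate = [random.uniform(-1, 1) for _ in experts]   # normally learned

    def __call__(self, x):
        scores = softmax([g * x for g in self.gate])
        # route: pick the k best experts; the other experts stay inactive
        top = sorted(range(len(scores)), key=scores.__getitem__)[-self.k:]
        norm = sum(scores[i] for i in top)
        return sum(scores[i] / norm * self.experts[i](x) for i in top)

# eight tiny "experts"; only two of them actually run per call
experts = [lambda x, i=i: (i + 1) * x for i in range(8)]
layer = SparseLayer(experts, k=2)
y = layer(0.5)
```

The energy savings Dean describes come from exactly this property: compute scales with k, not with the total number of experts.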

The main purpose of Pathways, Dean concludes, is to advance humans “from the era of single-purpose models that merely recognize patterns to one in which more general-purpose intelligent systems reflect a deeper understanding of our world and can adapt to new needs.” This last purpose—reflecting a deeper understanding of our world and adapting to new needs—is particularly important, says Dean, as it will help us address impending global challenges, mainly ecological ones. New paradigms in AI modeling will cut costs and energy use, helping to build a more sustainable future for AI research. 

So how does any of this relate to translation, or to the relevant fields of machine translation and natural language processing? As explained in our other article on Lokalise’s switch to carbon neutrality, training and using NLP models and machine translation systems consume a surprisingly large amount of energy. With more efficient NLP and MT systems in place, we can start to worry less about the environmental ramifications of our translation work. Efficiency is emerging as a standard alongside accuracy, and this trend toward greener futures is all the more applicable to the AI-intensive field of translation. 

Pathways is much like massively multilingual machine translation in the sense that general, multimodal systems are developed in favor of single-use systems designed for one purpose only. Developments in AI now give us integrative models that are customizable and applicable in various contexts; other factors, such as metadata, help increase the extent of such customization. Pathways is the next step in our future of faster, more efficient, and more effective AI models. Sustainability and development, hand in hand, no longer enemies. 


Slator Language Industry Job Index Indicates More Jobs for Translators

Slator’s very own Language Industry Job Index (LIJI) saw an upward trend in March 2022, indicating more hiring activity in the language industry. Designed to “track employment and hiring trends in the global language industry,” the LIJI is useful for scoping out the current state of the language industry around the world. The Slator LIJI increased by more than three points to a 2022 high of 176.6, although this is slightly lower than the December 2021 peak of 180.5. 

Slator also recently published its analysis of the language industry market, reporting that the combined US-dollar revenue of major LSP companies rose by more than 22% in 2021. Among these companies is Japan’s Honyaku Center, which reported that “overall sales for the first three quarters increased 6.1% year-on-year to JPY 7.53 bn (USD 65m)” from April 1 to December 31, 2021. In Australia, Ai-Media—a captioning service provider—saw revenues rise 29%; UK-based RWS also saw revenues rise, crossing the billion-dollar mark for the first time. These are just three of the many LSP companies that have seen a general upward rise in their revenues—and thus in hiring as well. Slator also tracked LinkedIn data and found more than “600,000 profiles under the Translation and Localization category.”

Things are looking up for the language industry overall; artificial intelligence and machine translation have not driven human translators out, as people once feared would happen. There is a counterintuitive, strained relationship between artificial intelligence and human translation: a relationship that can’t easily be put on a scale of good or bad, right or wrong, in or out. Rather than a mode of competition between machines and humans, the new paradigm is one of cooperation: human translators harnessing the possibilities of artificial intelligence to assist in their work (as is the case with computer-aided translation tools, among others). Prospects for translators and linguists, then, remain safe and well-guarded from the reaches of artificial intelligence. There’s even data to back that up. 




Lokalise Goes Green and Other News from the Language Industry

Argos Multilingual and Lokalise Extend Language Technology Partnership

Argos Multilingual, a global language solutions provider, recently announced that it has officially extended its partnership with Lokalise, a popular localization platform. Lokalise has an existing partnership with Venga Global, which was acquired by Argos Multilingual in October 2021; that partnership now extends to Argos itself, so that clients of both companies can benefit. 

This partnership will affect the already significant number of organizations that use Argos Multilingual’s language solutions alongside the Lokalise platform. With the newfound partnership, clients will have an easier time collaborating and optimizing processes; it will also allow Argos to provide consultation to clients and recommend Lokalise configurations that best fit their needs. Things are looking good for Lokalise, which has also recently raised 50 million USD in a Series B investment round, to be used to further accelerate hiring, product development, and partnerships. Here’s what Libor Safar, VP of Growth at Argos Multilingual, says about the partnership:

A number of our clients already use Lokalise, and we see a constantly growing interest in the platform. And not just in the high-tech space, but also in other industries such as life sciences, as organizations increasingly invest into digital products for their customers… We share Lokalise’s goal to simplify the localization process and so we’re thrilled to welcome them among our technology partners.

Over at Lokalise, Petr Antropov, CRO, states a similar sentiment:

Building partnerships is one of our key focus areas, and partnerships with Language Services Providers (LSPs) like Argos Multilingual are another big step in helping our customers achieve their goals. Lokalise remains a pure-play technology provider, and as we focus on the further extendibility and openness of the Lokalise platform, our customers and LSPs should be able to both work together effortlessly and connect the machine translation engine or translation memory of their choice.

The Argos-Lokalise partnership attests to the growing interest in and necessity of close collaboration between technology providers and language service providers; combined, these partnerships provide the most optimal localization and platforming services to clients in need of better outreach, service function, and marketing. 


Memsource Opens US Data Center for North American Customers

The Prague-based CAT (computer-aided translation) tool company Memsource has just opened a second data center, in the United States, catering to its growing number of American customers. While Memsource formerly operated only an EU-based data center, the new North American location signals the company’s growing presence in the US market. Furthermore, the addition is especially helpful for customers with compliance requirements to store their data in the United States. 

From February 22 this year, new customers can choose between Memsource’s EU data center and the US data center upon registering for service. Accounts created for the US data center will reside completely and only in the US, and no data will be shared with the EU data center, and vice versa. Martin Švestka, director of product management, says this about the new center: “Letting new customers choose where their data is stored is in line with our commitment to raising the bar for security and compliance standards in the translation world.”

The new center is beneficial for customers dealing with compliance requirements, as some US organizations require their data to be stored domestically. The data centers are completely separate cloud infrastructures, between which there is no data sharing, integration, or migration path. Not only that, but the new center also improves performance for North American customers by eliminating transatlantic latency. Likewise, new customers in East Asia are expected to see better performance if they choose the US data center. 


Lokalise Goes Green with Its Carbon-Neutral Localization

In line with the current climate change crisis, Lokalise—a Riga-based localization and translation service provider—has become the first major translation management platform to go carbon neutral. It’s a title not many companies can boast in their field, as well as a title that will become more and more important in the coming years. 

In a blog post, VP of Engineering Elliot Kim explains how he has pushed the company in a greener direction, thanks to the help of similar-minded people and the company’s willingness to do its part for the environment. Not only will Lokalise’s shift into carbon neutrality attract more customers who are keen to work with eco-friendly companies, but it also signifies the translation industry’s participation in the global movement towards climate-friendly policies. 

Kim further elaborates on the company’s decision to go green, citing one of the company’s values: “Take ownership. Optimize for impact.” Kim then explains how, “within weeks” of suggesting the idea, “a group spread across nine countries organized their efforts to research the topic, interviewed sustainability experts, and calculated our carbon footprint.” He, alongside other proponents of the change, pitched their idea to the company founders and the rest of the management. 

This raises the question: how exactly does Lokalise plan to make this change? Kim expounds on the topic. First, the company plans to opt into the Stripe Climate initiative, which automatically funnels a percentage of company revenue into high-quality carbon removal projects. “But that’s not enough,” says Kim, noting that there are variable costs to consider when donating through Stripe, making it hard for Lokalise to predict exactly how much money is being spent on climate change. As a result, Lokalise will calculate its carbon footprint on a quarterly basis; in case its commitment to Stripe Climate falls short, the company will find other high-quality ways to offset the difference. Finally, the company offers three additional ways in which it is reducing its carbon footprint:

  1. When we travel, we’re encouraging land-based options over flights (air travel emits more than 3x the amount of carbon as a train per unit of distance). 
  2. New employees have always had a budget to set up their home office. Now they can choose to donate a percentage of their budget to planting trees. 
  3. We’re going to institute a preference for vendors that share our commitment to reducing carbon in the atmosphere. 

Lokalise hopes that other companies—especially its fellow enterprises in the translation industry—join them in their efforts to do their part. 

After all, language service providers operate numerous NLP models, which consume a great deal of energy and produce a surprisingly large amount of carbon emissions. Researchers at the University of Massachusetts Amherst found that models are “costly to train and develop, both financially, due to the cost of hardware and electricity or cloud compute time, and environmentally, due to the carbon footprint required to fuel modern tensor processing hardware.” In terms of numbers, an off-the-shelf AI language-processing system produced 1,400 pounds of emissions, the equivalent of one person flying round trip between New York and San Francisco. If we take into account the experiments needed to build and train AI language systems from scratch, that number goes up to 78,000 pounds of carbon dioxide: twice as much as the average American exhales over an entire lifetime.
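Estimates like these come from straightforward arithmetic: energy drawn by the training hardware, scaled by datacenter overhead (PUE), times the carbon intensity of the grid. A rough sketch of that calculation, using the Amherst paper’s US-average constants as illustrative defaults (the example workload is hypothetical):

```python
def training_emissions_lbs(gpu_watts, n_gpus, hours,
                           pue=1.58, lbs_co2_per_kwh=0.954):
    """Back-of-envelope training emissions: hardware energy, scaled by
    datacenter overhead (PUE), times grid carbon intensity. The default
    constants are the Strubell et al. US averages; treat every number
    here as illustrative, not authoritative."""
    kwh = gpu_watts * n_gpus * hours / 1000 * pue
    return kwh * lbs_co2_per_kwh

# e.g. a hypothetical run: 8 GPUs drawing 300 W each for 240 hours
est = training_emissions_lbs(300, 8, 240)
```

The striking paper figures come from scaling exactly this kind of calculation up to thousands of GPU-hours of hyperparameter search and architecture experiments.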

Building on this, a study by researchers at the Manipal Institute of Technology found that, among popular language pairs between English, French, and German, languages paired with English were the most carbon-intensive. The emissions correlate directly with better performance. There hasn’t been research into what exactly causes the differences in emissions, nor is there research on the carbon emissions of low-resource language pairs. 

Lokalise is building on the work of these researchers, putting into practice what their striking findings suggest must be done for a greener AI ecosystem. Researchers at the Allen Institute for AI, Carnegie Mellon University, and the University of Washington have proposed that companies utilizing or developing artificial intelligence follow the principles of “Green AI”: a paradigm in which efficiency is a major evaluation criterion alongside accuracy and related measures. This stands in contrast to previous paradigms, which focused mainly on obtaining “state-of-the-art results in accuracy through the use of massive computational power—essentially ‘buying’ stronger results.” In the authors’ words:

The vision of Green AI raises many exciting research directions that help to overcome the inclusiveness challenges of Red AI. Progress will reduce the computational expense with a minimal reduction in performance, or even improve performance as more efficient methods are discovered. Also, it would seem that Green AI could be moving us in a more cognitively plausible direction as the brain is highly efficient.

It’s time to start thinking differently, as Lokalise does: moving toward a greener infrastructure of growth. 




Updates from the Language Industry: LSPI, NeuralSpace, 3Play Media, and RWS

The Slator 2022 Language Service Provider Index

2022 is starting strong for the language industry, according to Slator’s very own 2022 Language Service Provider Index (LSPI), which has gathered revenue data from 295 language service providers from around the world. The Slator LSPI charts the growth (and decline) of the world’s largest translation, localization, interpreting, and language technology companies. The 2022 index features more than 100 new companies compared to last year’s index.

Slator divides LSPs into four categories based on their revenue:

  1. Super Agencies: revenues greater than 200 million USD
  2. Leaders: revenues greater than 25 million USD and under 200 million USD
  3. Challengers: revenues between 8 million USD and 25 million USD
  4. Boutiques: revenues between 1 million USD and 8 million USD

In Slator’s podcast, SlatorPod, research director Esther Bond and managing director Florian Faes provide some highlights of this year’s LSPI. Of the 295 companies listed on the 2022 LSPI, 5 are super agencies, 55 are leaders, 49 are challengers, and 186 are boutiques, revealing that most companies in the language industry are small in size. Geographically, Europe is home to the most LSPs on the index, with 139 companies based in Europe excluding the UK, which on its own has a whopping 33 LSPs. North America claims 68 LSPs, 58 of which are located in the United States. Asia is home to 20 of the companies on the index; an additional 35 are located in areas not mentioned above. 

According to Bond and Faes, most companies on the index experienced substantial growth: worthy of note, given the dire situation COVID-19 has placed many industries in. The LSPI estimates nearly 22% growth in combined US-dollar revenue, taking the total up to USD 9.4bn. However, it is important to understand that a significant percentage of this figure is the result of consolidation driven by M&A (mergers and acquisitions). Organic growth, on the other hand, was in the low single digits all around. Taking into account both M&A and organic growth, these are the growth rates by category:

  • Super Agencies: 33% growth from 2020
  • Leaders: 18% growth from 2020
  • Challengers: 14% growth from 2020
  • Boutiques: 16% growth from 2020

This growth coexists with decline at some companies; 10% of leaders, 8% of challengers, and 10% of boutiques reported some level of decrease in revenue. Lastly, new to the 2022 index is a headcount category, for which Slator asked companies to share their employee numbers. The average LSP employs about 137 people, although the median could tell quite a different story, says Bond. 

Overall, the Slator 2022 Language Service Provider Index (LSPI) tells a positive, uplifting story about the state of the language industry. Despite the disruption of COVID-19, LSPs are thriving, spurred on by international cooperation and developments in the AI sector, among other factors. 


NeuralSpace Raises 1.7 Million USD

More positive news from the language industry: NeuralSpace, a London-based SaaS (Software as a Service) platform, has raised 1.7 million USD in a seed round led by Merus Capital. NeuralSpace offers developers a “no-code web interface and a suite of APIs” for NLP tasks, no machine learning or data science knowledge required; available applications include Natural Language Understanding (NLU), entity recognition, machine translation, transliteration, and language detection. 

For enterprises that need such functionalities, NeuralSpace offers easy and compatible NLP applications for companies to implement into their websites, products, etc. The possibilities of NeuralSpace’s applications are broad; the company cites use cases in areas such as media, entertainment, gaming, electronics, appliances, healthcare, wellness, automobiles, education, banking, and financial services.

NeuralSpace’s offerings are similar to those of its competitor Hugging Face, another company that provides open-source NLP technologies. What sets NeuralSpace apart, however, is its focus on low-resource languages. In its mission statement, NeuralSpace CEO Felix Laumann notes that “more than 90% of all NLP solutions are exclusively available for European languages… only 6% are available for low-resource languages, mostly spoken in Asia and Africa.” 

According to Slator, NeuralSpace currently generates 100,000 USD in annual revenue and has a 19-person team. The company relies on transfer learning and on combining datasets to support low-resource language translation. 


3Play Media Acquires Captionmax

In the media industry, 3Play Media—the leading video accessibility provider—announced the acquisition of Captionmax, which specializes in live and recorded captioning, localization, and audio description services. According to BusinessWire, the acquisition includes National Captioning Canada (NCC), “the largest live captioning provider in Canada and a subsidiary of Captionmax, providing exciting geographic expansion for 3Play Media into the Canadian market.” The terms of the acquisition were not disclosed. 

3Play Media’s acquisition of Captionmax means that 3Play’s live captioning services will undergo more developments and improvements while allowing the company to expand its services into Canada. Josh Miller, co-CEO of 3Play, told Slator that the transaction will allow the company to “scale rapidly and be a leader in the media-accessibility market.” 

Live captioning is a growing field; related applications such as closed captioning, transcription, audio description, and subtitling offer multifold enhancements to viewers of media and entertainment, as well as sports, education, technology, enterprise, government, and e-learning, writes Slator writer Esther Bond. 

3Play also offers a platform that utilizes machine learning and automatic speech recognition (ASR) alongside human services and human review to provide support in “the voice-writing process, the failover to auto captioning, and the postback of captions to video platforms.” According to Bond, “3Play can integrate with multiple video and meetings platforms and generate captions from a Real-Time Messaging Protocol (RTMP) stream.” Miller explains that, when audio comes in, “a human captioner re-speaks the dialogue into primed speech recognition to produce a highly accurate live captioning product.” 

A unique capability of 3Play’s platform is that, according to Miller, “if at any point the captioner is disconnected, we will failover to automatic captions, then back to human captions when the captioner is reconnected—without any manual intervention,” ensuring that live captioning during an event never blanks out. Miller goes on to add that, “in a post-Covid world, it’s becoming clear that hybrid, virtual, global, and full time remote lifestyles are here to stay,” positing that there is a great need for captioning services in the future, near and far. 
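The failover behavior Miller describes can be sketched as a simple loop (hypothetical logic for illustration, not 3Play’s actual code): prefer the human caption feed, fall back to automatic captions the moment it drops, and switch back when it returns, so captions never blank out.

```python
def caption_stream(frames, human_caption, auto_caption, human_connected):
    """Toy failover loop: prefer the human captioner; fall back to ASR
    whenever the human feed drops, and switch back when it returns."""
    out = []
    for i, frame in enumerate(frames):
        if human_connected(i):
            out.append(("human", human_caption(frame)))
        else:
            out.append(("auto", auto_caption(frame)))
    return out

frames = ["a", "b", "c", "d"]
captions = caption_stream(
    frames,
    human_caption=str.upper,           # stand-in for the re-spoken feed
    auto_caption=str.lower,            # stand-in for automatic ASR captions
    human_connected=lambda i: i != 2,  # captioner drops out on frame 2
)
assert captions == [("human", "A"), ("human", "B"),
                    ("auto", "c"), ("human", "D")]
```

The key design point is that the switch happens per frame, with no manual intervention: the consumer of the stream always gets some caption, labeled by its source.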


RWS Hits 1 Billion US Dollar Revenue

RWS, a UK-based language service provider, has recently announced strong growth for 2021, having generated an annual revenue of 694 million GBP (940 million USD). This places RWS in second place on the 2022 LSPI leaderboard, with most of its growth coming from the acquisition of its major rival, SDL. Organic growth was recorded at 4%. RWS has a global team of over 7,500 employees specializing in localization, content creation, artificial intelligence, and IP services, making it one of the biggest language service providers in the world. 

The information comes mainly from RWS’s annual general meeting (AGM) statement, in which the company Chairman Andrew Brode made a series of remarks regarding the company’s strong financial growth and positive prospects for the near future, such as this highlight of the company’s recent progress:

[RWS] delivered a strong set of results in its 2021 financial year, with a better than expected profit performance, against the background of the Covid-19 pandemic. This was a year in which the Group acquired SDL, creating a world leading provider of technology-enabled language, content and IP services, doubling the Group’s size, and adding new client relationships and capabilities.

According to Brode, RWS saw strong performance from its Language Services and Regulated Industries divisions, which offset weaker performance in the IP Services division. With this strong performance, the group intends to invest for growth in software and internal systems, as well as in selective acquisitions. “Our strategy,” says Brode, “will harness our broader technologies to deliver new solutions to clients, drive further operating efficiencies and ensure the Group is at the forefront of the technology-led evolution of our industry.” RWS expects 2022 to bring the company over 1 billion USD in revenue. 




Meta to the Rescue: New Projects for Translation Models Announced

Meta (formerly known as Facebook) has just announced that it will be investing in research on a universal speech translator and a new advanced AI model for low-resource languages. The news comes some three months after Meta’s translation model beat out other models at the 2021 Conference on Machine Translation. 

The tech giant’s devotion to translation is heartwarming—and sorely needed. For people who speak one of the major languages of the world—English, Spanish, Mandarin—translation is no big deal; many high-quality translation programs are available online for speakers of these languages. At the same time, there are billions of people out there who cannot access the abundance of information available on the internet, merely on the basis of their native tongue. Advances in machine translation can help bridge these gaps and eliminate barriers in communication. Meta claims that these developments will “also fundamentally change the way people in the world connect and share ideas.”

But the path to universal communication isn’t quite so easy. The problem with MT systems today is that they don’t work well for low-resource languages, as there simply isn’t enough training data. Meta lists three major hurdles in perfecting machine translation: overcoming data scarcity by acquiring more training data in more languages; overcoming modeling challenges that will arise as models scale to accommodate more and more languages; and finding new ways to evaluate and improve on results. 


Problem 1: Data Scarcity

Data scarcity is perhaps the greatest of machine translation’s problems, especially for languages that are not spoken by many people, and all the more so for languages that don’t have written scripts. MT development currently relies heavily on sentence data for improving text translations; as a result, only languages with plenty of text data (English, Spanish, etc.) have been the focus of MT development. 

This is even more of a problem for direct speech-to-speech translation. Speech-to-speech translation is more limited in capacity than text translation, as it most frequently uses text as an intermediary step: speech in the source language is transcribed into text, translated into the target language, and then fed into a text-to-speech system for audio generation. This dependency on text greatly inhibits speech-to-speech translation’s efficiency, not to mention the sheer lack of speech recordings to use as analyzable data.
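The cascaded pipeline described above can be sketched as three chained stubs (the function names and the toy dictionary are hypothetical placeholders for real ASR, MT, and TTS models). Each hop adds latency, and errors in one stage propagate into the next, which is exactly what direct speech-to-speech models aim to avoid:

```python
# Hypothetical stubs standing in for real ASR, MT, and TTS systems.
def asr(audio, src_lang):
    """Speech recognition: audio -> source-language text."""
    return audio["transcript"]              # pretend the model transcribed it

def mt(text, src_lang, tgt_lang):
    """Text-to-text machine translation (toy dictionary)."""
    toy = {("de", "en"): {"hallo welt": "hello world"}}
    return toy[(src_lang, tgt_lang)][text]

def tts(text, tgt_lang):
    """Speech synthesis: target-language text -> audio."""
    return {"transcript": text, "lang": tgt_lang}

def cascaded_s2s(audio, src_lang, tgt_lang):
    # Three hops: speech -> text -> translated text -> speech.
    return tts(mt(asr(audio, src_lang), src_lang, tgt_lang), tgt_lang)

out = cascaded_s2s({"transcript": "hallo welt"}, "de", "en")
```

Note that the middle (MT) stage only ever sees text, so any language without a written script, or without enough transcribed speech, cannot even enter this pipeline.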


Problem 2: Scaling

The second challenge for MT development is to overcome modeling challenges as MT systems grow bigger and bigger to accommodate more languages. So far, many MT systems have approached translation from a bilingual perspective, utilizing a separate model for each language pair (such as English-Russian or Japanese-Spanish, for example). While a bilingual model works well on its own, it is unrealistic to maintain thousands of different models for the thousands of languages in the world; we can’t simply create a new model every time we need translations for a different language pair. 

Translation research has been looking into ways to overcome these limitations; recent developments reveal that multilingual approaches—not bilingual—are more efficient at handling larger combinations of language pairs. Multilingual models are simpler, more scalable, and better for low-resource languages; until recently, multilingual models couldn’t match bilingual model performance for high-resource languages such as English and Spanish, but that is no longer the case. Meta’s translation model that outperformed other models at the 2021 WMT—that was a multilingual model, too. For the first time, a single multilingual model has outperformed bilingual models in 10 of 14 language pairs, including both low- and high-resource languages. We have a long way to go, however; Meta notes that “it has been tremendously difficult to incorporate many languages into a single efficient, high-performance multilingual model that has the capacity to represent all languages.” 
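One common way a single model serves many language pairs in multilingual NMT (sketched here with a toy lookup table standing in for a neural network) is to prepend a target-language token to the input, so one set of parameters handles every pair:

```python
# Toy many-to-many "model": one lookup table keyed by a target-language
# tag prepended to the input, instead of one model per language pair.
TABLE = {
    ("<2en>", "hola"):  "hello",
    ("<2fr>", "hola"):  "bonjour",
    ("<2en>", "salut"): "hi",
}

def translate(sentence, tgt_lang):
    tag = f"<2{tgt_lang}>"   # language token, as in multilingual NMT
    return TABLE[(tag, sentence)]

assert translate("hola", "en") == "hello"
assert translate("hola", "fr") == "bonjour"  # same "model", new pair
```

In a real multilingual model the shared parameters also let low-resource pairs borrow statistical strength from high-resource ones, which is why the approach is better for low-resource languages.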

For real-time speech-to-speech MT models, the same challenges remain alongside an additional difficulty: latency. Latency here refers to the lag in real-time translation, a problem that must be overcome before speech-to-speech translation can be used effectively. A major cause of latency is that word order varies across languages; even professional simultaneous interpreters struggle with this, with average latency running around three seconds. Meta provides this example for understanding the problem:

Consider a sentence in German, “Ich möchte alle Sprachen übersetzen,” and its equivalent in Spanish, “Quisiera traducir todos los idiomas.” Both mean “I would like to translate all languages.” But translating from German to English in real time would be more challenging because the verb “translate” appears at the end of the sentence, while the word order in Spanish and English is similar.


Problem 3: Result Evaluation

A crucial part of developing MT is evaluation; it’s a step many people tend to overlook, but machines, like human translators, need feedback to improve. While evaluation models and standards exist for popular language pairs (e.g. English to Russian), such standards aren’t readily available for more obscure pairs (e.g. Amharic to Kazakh). Like the transition from bilingual to multilingual models, we need to start thinking of multilingual, comprehensive, one-for-all evaluation methods, so that we can measure MT system accuracy and make sure translations are carried out responsibly. Such evaluation should include making sure MT systems preserve cultural sensitivities and do not amplify biases present within natural languages. 


The main question now is: how do we overcome these three challenges of MT development? Thankfully, Meta has answers here as well; while those answers are still largely hypothetical, they are backed by plenty of prior data and research. 


Solution 1: Managing Data Scarcity

Meta’s answer to the problem of data scarcity is to expand its automatic data set creation techniques. For this, Meta brings in the LASER toolkit, short for Language-Agnostic SEntence Representations; LASER is an open-source, massively multilingual toolkit that converts sentences of various languages into a single multilingual representation. Afterward, a large-scale multilingual similarity search is carried out to identify sentences that have a similar representation. 
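The core mining idea—embed everything into one space, then pair up nearest neighbors across languages—can be sketched in a few lines. The encoder below is a hard-coded stand-in for a real LASER model; all sentence vectors, names, and the similarity threshold are illustrative assumptions, not Meta’s actual implementation.

```python
import math

# Stand-in for a LASER-style encoder: a trained multilingual model maps
# sentences from any language into one shared vector space. The vectors
# here are hard-coded for illustration only.
EMBEDDINGS = {
    ("en", "The weather is nice today."): [0.90, 0.10, 0.20],
    ("de", "Das Wetter ist heute schön."): [0.88, 0.12, 0.19],
    ("es", "Me gusta leer libros."):       [0.10, 0.90, 0.30],
    ("en", "I like reading books."):       [0.12, 0.88, 0.31],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mine_parallel_pairs(threshold=0.99):
    """Pair sentences from different languages whose embeddings lie close
    together; this similarity search is the core idea behind bitext-mining
    systems such as ccMatrix."""
    items = list(EMBEDDINGS.items())
    pairs = []
    for i, ((lang_a, sent_a), vec_a) in enumerate(items):
        for (lang_b, sent_b), vec_b in items[i + 1:]:
            if lang_a != lang_b and cosine(vec_a, vec_b) >= threshold:
                pairs.append((sent_a, sent_b))
    return pairs
```

Running `mine_parallel_pairs()` pairs the two weather sentences and the two book sentences while leaving unrelated sentences unmatched; a production system runs the same search over billions of web sentences using an approximate nearest-neighbor index rather than the brute-force loop shown here.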

A graphic illustrating the relationships automatically discovered by LASER between various languages. Note how they correspond to the language families manually defined by linguists. Image and caption credits: Meta AI, “Zero-shot transfer across 93 languages: Open-sourcing enhanced LASER library”

The image on the left shows a monolingual embedding space, whereas the one on the right illustrates LASER’s approach, embedding all languages in a single, shared space. Image and caption credits: Meta AI, “Zero-shot transfer across 93 languages: Open-sourcing enhanced LASER library”

A table illustrating how LASER was able to determine relationships between sentences in different languages. Image and caption credits: Meta AI, “Zero-shot transfer across 93 languages: Open-sourcing enhanced LASER library”

Using LASER, Meta has developed systems such as ccMatrix and ccAligned, programs capable of finding parallel texts online. LASER can also focus on specific language subgroups (such as the Bantu languages) and learn from smaller data sets, allowing reliable scaling across a wide expanse of languages. The possibilities don’t end there: Meta has recently extended LASER to work with speech as well, and has already identified nearly 1,400 hours of aligned speech in English, Spanish, German, and French. 


Solution 2: Building More Capable Models

Meta is also working to improve model capacity so that multilingual models can perform better even when scaled to accommodate more languages. Current MT systems suffer from performance issues, which lead to inaccuracies in text and speech translation; this is because current MT systems work within a single modality and across a select few languages. Meta hopes for a future in which translations work faster and more seamlessly, whether it’s going from speech to text, text to speech, text to text, or speech to speech. 

To achieve this goal, Meta is investing heavily in creating larger, more robust models that train better and function more efficiently, learning to route automatically so as to balance high-resource and low-resource translation performance. Meta points to its recent development of M2M-100, the first multilingual machine translation model that does not rely on English data. By eliminating English as the intermediary language, translations from one non-English language to another become more fluid and efficient, allowing a single model to match the quality of customized bilingual systems across many more language pairs. 

As for latency problems, Meta is currently working on a speech-to-speech translation system that “does not rely on generating an intermediate textual representation during inference”; such a paradigm has been shown to be faster than a “traditional cascaded system that combines separate speech recognition, machine translation, and speech synthesis models.” 


Solution 3: Coming Up with New Evaluation Models

Evaluation, important as it is to MT development, still has a long way to go in the context of massively multilingual translation models. After all, we need to know whether developments in MT are actually producing better data, models, and outputs. Meta notes that evaluating large-scale multilingual model performance is tricky: it is “time-consuming, resource intensive, and often impractical.”

To this end, Meta has created FLORES-101, the first multilingual translation evaluation data set covering 101 languages. FLORES-101 allows researchers to rapidly test and improve multilingual translation models by quantifying system performance in any language direction. While still under development in collaboration with other research groups, FLORES-101 is a critical building block for well-functioning massively multilingual models. 
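To make the “any language direction” idea concrete, here is a toy sketch of many-to-many evaluation in the FLORES spirit: the same sentences exist in every language, so a system can be scored on every ordered language pair. The token-overlap scorer is a crude stand-in for a real metric such as BLEU, and every name in it is illustrative, not part of the actual FLORES-101 tooling.

```python
from itertools import permutations

def overlap_score(hypothesis: str, reference: str) -> float:
    # Crude token-overlap score standing in for a real metric like BLEU.
    hyp, ref = hypothesis.lower().split(), reference.lower().split()
    if not hyp:
        return 0.0
    return sum(1 for tok in hyp if tok in ref) / len(hyp)

def evaluate_all_directions(dataset, translate):
    """dataset maps language -> list of sentences, aligned by index across
    languages (the FLORES setup); translate(src, tgt, sentence) is the
    system under test. Returns an average score for every ordered pair."""
    scores = {}
    for src, tgt in permutations(dataset, 2):
        pair_scores = [
            overlap_score(translate(src, tgt, sent), ref)
            for sent, ref in zip(dataset[src], dataset[tgt])
        ]
        scores[(src, tgt)] = sum(pair_scores) / len(pair_scores)
    return scores
```

With N languages this produces N×(N−1) direction scores from a single aligned data set—which is exactly why a many-to-many benchmark scales where per-pair evaluation sets do not.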

The FLORES data set compared to the Talks data set. Image credits: Meta AI, “The FLORES-101 data set: Helping build better translation systems around the world”

Meta’s vision of a connected future is ambitious. The company claims that such developments will “open up the digital and physical worlds in ways previously not possible,” as it slowly removes barriers to universal translation for a majority of the world’s population. Meta should be lauded for its inclusive, collaborative efforts; the company open-sources its work in corpus creation, multilingual modeling, and evaluation so that other researchers can join in. After all, if Meta’s mission is to open the world up to free, unlimited communication, it only makes sense that its efforts are built on collaboration and communication with external researchers. 

But the last paragraph of Meta’s announcement is particularly beautiful: 

Our ability to communicate is one of the most fundamental aspects of being human. Technologies — from the printing press to video chat — have often transformed our ways of communicating and sharing ideas. The power of these and other technologies will be extended when they can work in the same way for billions of people around the world — giving them similar access to information and letting them communicate with a much wider audience, regardless of the languages they speak or write. As we strive for a more inclusive and connected world, it’ll be even more important to break down existing barriers to information and opportunity by empowering people in their chosen languages.

As one of the world’s leading tech groups, Meta has the power and duty, among other things, to invest its resources in making the world a more connected place. The dream of a completely connected world: that is the direction of technology, and machine translation lies at the heart of it. 




Pwning Noobs and Typing in All Chat: Trends in the Gaming Translation Industry

Before the turn of the century, sports and films were the main sources of entertainment for the bored. Now, it’s flashy fight scenes and rapid clicks as people—young and old alike—turn to gaming as their main mode of fun. A report by Mordor Intelligence forecasts that the gaming industry, currently valued at 177 billion USD, is expected to nearly double its value by 2027. “The gaming market is growing… With the increasing use of smartphones and consoles and cloud penetration, the market shows high potential growth in the future,” the report states. 

The rise of gaming is accompanied by subsequent growth in the translation industry. Despite the universal appeal of video games, languages are still a hurdle; popular video games are often made in a few select languages, such as American English, Chinese, Japanese, and Korean—obvious, given these countries are powerhouses for video game production. According to research by translation company Andovar, close to “75% of the game market’s total global sales are just from the largest game sellers, which includes Electronic Arts (EA), Tencent Games, Sony, Nintendo, Activision Blizzard, Bandai Namco, and Ubisoft.” 

2019 game localization revenues. Newzoo.


2020 global games market. Newzoo.

2020 global gamers per region with year-on-year growth rates. Newzoo.

Many game localization providers have expressed positive attitudes toward their future, reportedly projecting 20-30% increases in their revenues in the coming years. Game localization is a key step in the production process, making sure users worldwide are able to enjoy the game in their respective languages. Regarding this growth, Andreea Balaoiu of the translation service Ad Verbum points out three general trends in the game localization business to keep an eye out for. 


1) Cultural Sensitivity & Translation Accuracy

Balaoiu notes that cultural sensitivity is an increasingly important concern in game production. It has always been an issue, yes, but growth and development in the industry bring greater attention to detail and accuracy. For example, it’s no longer acceptable for Latin American gamers to simply make do with games localized for Spain; the linguistic differences, once accepted out of necessity, are too great to ignore. A recent petition regarding Nintendo’s Pokémon localization has stirred up controversy, with gamers pointing out that the lack of proper localization has led to grave errors in translation. 

The petition, signed by more than 20,000 gamers, claims that European Spanish terms and phrases are not always appropriate for the Latin American market; the slightest changes in word choice can lead to vulgar expressions that are not suitable for younger gamers. As of 2020, Latin America is home to more than 259 million gamers; as such, proper localization practices should be the norm. 


2) The Metaverse

With Facebook’s recent transition to Meta, the world has entered a new era in which virtual experiences rival, and sometimes trump, those of the real world. Balaoiu’s point is that the complex nature of the metaverse requires even more delicate translation and localization work. She notes the complexity of metaversal content, “from textual content (UIs, captions, subtitles, in-video game text) to audio content (voice-overs, dubbing) and holistic creative and marketing content (graphics, artwork assets, digital advertising, landing pages, social media pages) all the way to labels and policies.”

If game companies could previously get away with shoddy localization practices, they can no longer put off localization as games come to mimic reality more closely than ever. In the near and far future, we expect to see more interest and attention given to game localization providers as the metaverse expands. 


3) Technological Developments

With the demand for better technology in the language industry, there also comes a demand for QA, technology integration, and NMT support personnel. Translation relies heavily on MT, especially in computer-aided translation (CAT) tools. The same applies to virtual and augmented reality technology. 


However, there are credible reasons why companies often refuse to have their games localized. Localization is a complicated, lengthy process that can’t be rushed (although many companies do, ending up with a messy, incoherent product). In a Game Global article by MT specialist Cristina Anselmi and localization veteran Inés Rubio, the authors cite three specific difficulties unique to in-game text localization:

Terminology: Consistency in terminology is fundamental not only to ensure a good gaming experience but also to prevent noncompliance issues that might hinder the release of the game.

Variables and tags: The presence of variables and tags poses a technical challenge. Variables and tags both need to be respected in the translated text, as they will be replaced by the player name (for instance) or by a link to a screen in the game itself. Sometimes they are just cosmetic tags to modify the text format. Mistakes could result in code errors that would provoke functionality and display issues, disrupting playability.

Creativity: Aside from the technical components, one of the biggest challenges with NMT in the gaming industry is creativity. The types of texts can vary a lot from conventional on-screen text, and due to this, the required level of creativity changes. Apart from audio recordings, which by nature need to be quite liberal and natural-sounding, we often come across made-up language or puns and jokes that need to be transferred to the target language. 
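The variables-and-tags difficulty above lends itself to automated QA: translated strings can be checked to confirm that every placeholder survived. Below is a minimal sketch of such a check; the placeholder syntax ({name} variables and angle-bracket tags) and all strings are hypothetical examples, not tied to any real engine or localization tool.

```python
import re

# Matches {variable} placeholders and <tag>-style markup. This syntax is
# a hypothetical example; real game engines each define their own formats.
PLACEHOLDER = re.compile(r"\{[^}]+\}|<[^>]+>")

def missing_placeholders(source: str, translation: str) -> list:
    """Return the source placeholders that do not survive in the translation."""
    return [p for p in PLACEHOLDER.findall(source) if p not in translation]

source = "Welcome, {player_name}! You earned <color=gold>{coins}</color> coins."
good   = "¡Bienvenido, {player_name}! Ganaste <color=gold>{coins}</color> monedas."
bad    = "¡Bienvenido, jugador! Ganaste {coins} monedas."
```

The first translation passes cleanly, while the second is flagged for dropping {player_name} and both color tags—exactly the kind of mistake that causes the code errors and display issues Anselmi and Rubio describe.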


The Necessity of MT in Game Localization

Despite these difficulties (or rather, because of them), Anselmi and Rubio argue for the implementation of MT technology in the game localization industry. Game localization occupies a unique space somewhere between technical translation (dealing with code) and creative writing (in-game text); as a result, working with MT can help tremendously with speed and accuracy. In the modern workplace, where speed is often valued over quality, translators can stay relevant by using the full capabilities of MT to produce high-quality output quickly. 

Anselmi and Rubio point out how “MT can help by being more agile and increasing the speed of delivery.” But speed isn’t the only benefit to implementing MT, as MT also helps with utility. “MT might be needed if only to understand the general meaning of specific documentation for internal purposes,” the authors write, “to avoid dedicating precious resources and time to this task.” Then, of course, there is cost. Machine translation can help companies save money by decreasing time and resources spent on translation. 

Konstantin Savenkov, CEO of the AI startup Intento, reminds readers in his Game Global interview that “a lot of content [in the game industry] needs to be translated in real-time.” Carefully meditated, delicately written in-game text isn’t the only source text that needs translating in the localization process; there are also “other types of content to translate aside from product-related content” such as “support tickets, support chats with customers, and even chats between players” and “internal use cases, i.e. communication amongst developing teams who might sit in different locations all over the world or software documentation.” While these tasks don’t necessarily fall under the realm of localization, they are added benefits of MT implementation alongside product translation and localization. 


The Current State of Translation in the Gaming Industry

All this is idealistic, however. Despite advancements in MT technology and translators’ willingness to work with MT, game companies often allocate less-than-adequate funding to properly localize their games. This affects gamers all over the world, hindering gameplay and user experience. In an article for Input, reporter Jay Costello writes of various mistakes caused by improper localization practices, taking the example of Brazilian YouTuber Rodrigo Soncin:

In Hexen II, for example, [Soncin] says translators appeared to struggle with words that seem like English homonyms, but aren’t. In a dungeon, one text read “você encontrou o tombo do Lorick” or “you have reached the tumble of Lorick.” Presumably, Soncin says, they meant “tomb” – tumba, not tombo. In another case, he searched for some people, when he should have been looking for a staff. Somehow, the sceptre had been mistranslated as a group of employees.

Costello interviews Japanese to English translator Katrina Leonoudakis, who stresses the importance of ample resources and communication for high-quality localization. “Localization involves dozens of people per language,” says Leonoudakis; “no localization decision happens in a bubble, and good localization teams have strong communication lines between everyone involved.” 

The problem, then, is not the state of MT technology; MT is advanced enough to be used as an aid for translators to work faster and more accurately. Rather, the problem at hand is the general lack of understanding of why localization is time-consuming yet necessary. “The main issue is… low pay, really tight timelines… the result is always low quality,” says Emma Ramos, a former translator who’s worked both freelance and in-house.

The solution? Better awareness on the part of management, so that appropriate funding is allocated for the localization and translation steps of product development. But that is a great feat, a luxury translators can’t fantasize about as they hurriedly type away, mistakes and typos slipping past their fingers as they try to meet outrageous deadlines. Even then, their work will always be better than that of machines, ingesting and throwing out poorly processed, badly translated texts for the next user or player to grimace over. 


Here at Sprok, we’re dedicated to providing the best translation and localization to our customers. If you’re looking for qualified, professional translators and localization specialists to work on your project, visit sdts.pro now and ask for a free quote.





2022 Regional Trends in Translation and Localization

As the term suggests, “localization” is the process of adapting a text from one language to another to best fit the traditions and expectations of a specific local market. Localization is necessary precisely because of the vast differences in language and culture between countries and regions. The differences don’t stop there, however; each region and country charts a specific growth according to the political, economic, and cultural factors that shift day by day, year by year, creating varied environments in which translation flourishes—or dies off. 

In light of translation’s wide-reaching, global, and multicultural possibilities, today we introduce the major translation and localization trends of 2022, arranged by geographical region. By now, the common global trends in translation are clear: machine translation, voice-to-voice translation, post-editing, video media translation, and so on. However, each region has unique demands and capacities that need to be addressed; keeping these differences in mind provides a better, more nuanced idea of how the translation industry is evolving as a whole.



Asia-Pacific

The computer-aided translation (CAT) company MemoQ has been keeping track of developments in the Asia-Pacific region, noting how dramatically the translation industry has been growing there, especially in China and Japan. The Asia-Pacific region is one of the fastest-growing language service provider (LSP) markets, with Japanese and Chinese LSP services dominating the regional market. Japan’s Honyaku Center and China’s Pactera Technologies now place in the global top 10 LSPs as of 2020; accordingly, MemoQ has deployed a second server in Japan. 

MemoQ is on the watch for growth in strategic partnerships in the region, picking out China’s Pactera Technologies in particular. The Beijing-based company has received Microsoft’s award for top China System Integrator, making it “one of Microsoft’s leading partners in the worldwide LSP market.” This is just the beginning, MemoQ notes, of “strategic alliances between the tech sector and LSPs, especially with cloud computing platforms and machine learning developers.” With the continued rise of Japan and China in the global translation market (and commerce in general), it is likely that more of these alliances will form and grow over the coming years. 

The Microsoft-Pactera alliance signifies yet another important trend in the Asia-Pacific region: improved LSP customer experience with cloud-based solutions. Pactera has had great success using cloud-based technology to improve customer experiences; MemoQ advises clients to “watch for more LSPs to add CXOs to their teams to drive client relationships, as customer experience becomes a high priority.” Pactera’s implementation of machine learning for quality control has made it one of the most innovative LSPs, single-handedly driving up the demand for AI-infused translation processes to better equip translators and achieve high-quality outputs. 

All this development in the Asia-Pacific region owes largely to government investments in the translation sector. The Japanese government is investing 19 million USD for the development of simultaneous interpretation technology, following the Chinese government’s investment in local LSPs under the country’s One Belt One Road initiative. Such government backing in the sector means “competition [will] heat up between the Asia-Pacific sector and the rest of the global translation industry,” claims MemoQ. 

Finally, the Asia-Pacific region (especially China and Japan)’s healthcare industry is expected to grow to $11.9 trillion; after all, the region is home to the second-largest healthcare industry in the world. As a result, the demand for medical translators will grow drastically, predicts MemoQ. 



Europe

Europe is a powerhouse in the language and translation industries, home to numerous top-ranking LSPs; the region also boasts a long tradition and history of translation. In the 2021 European Language Industry Survey, Rudy Tirry of the EUATC (European Union Association of Translation Companies) gives a brief overview of the trends, issues, and worries European translators and LSPs have about the imminent future of the language industry. 

As expected, the survey revealed that machine translation remained the single most important trend in Europe, as voted by training institutes, buyers and language departments, and independent professionals. Translators expressed general concern over improving machine translation quality but also commented that such improvements will bring more focus to MTPE (machine translation post-editing), as well as to human translation niches. Overall, the frightening speed at which machine translation has revolutionized the translation industry has forced LSPs and translators to “rethink operations” and grapple with “better software” in an attempt to stay relevant and competitive in an industry that grows more crowded with each passing day. 



Africa

One can’t speak of economic growth in Africa without also speaking of the inherent inequalities that have limited growth on the continent in past decades. With so much translation development concentrated in North America and Europe—mainly around English and high-resource Romance languages—hardly any attention is given to the researchers and scientists working in Africa to improve communication between the 2,000+ languages spoken on the continent. 

Leading translation research in Africa is Vukosi Marivate, a founding member of Masakhane—“a pan-African research project to improve how dozens of languages are represented in the branch of AI known as natural language processing.” Masakhane’s mission is to bring African languages to the forefront of data science and artificial intelligence research, directing attention and resources to major languages that are all but ignored by bigger AI and NLP research groups such as Google and Microsoft. 

Working alongside them is writer and linguist Kola Tubosun, who “created a multimedia dictionary for the Yoruba language and also created a text-to-speech machine for the language.” Tubosun is now developing speech-recognition technologies for Hausa and Igbo, Nigeria’s two other major languages; in the past, he has also led a Google research project, creating a “Nigerian English” voice for map applications. 

There is also Remy Muhire, a Rwandan software engineer who is developing a new open-source speech data set for the Kinyarwanda language with the help of native volunteer speakers. Marivate, Tubosun, and Muhire are examples of scientists working to develop language translation technology for African languages. Speech-to-speech translation, as well as translation in major languages such as Igbo, Hausa, and Yoruba, will be crucial to further communication and economic development in Africa. 


Latin America

We recently published a blog post on Viva Translate, a platform offering real-time email translations and streamlined transaction procedures for freelancers in Latin America as they communicate with clients abroad. Viva Translate shows just how many professionals reside in Latin America and how large the untapped talent pool there is. In a sense, Viva Translate exemplifies a successful approach to localized experience: the company believes its “offering of specialized MT to individual users is what makes it stand out.” Viva Translate identified the specialized needs of the South American and North American markets, something larger companies (Microsoft Translate, Lionbridge, etc.) cannot do with as much grace. 

A more recent issue in the Latin American translator discourse concerns the Pokémon games and their blatant lack of Latin American localization. Nintendo released the European Spanish version of the games without taking into account the differences between European Spanish and the various dialects of Latin American Spanish, leading to serious misunderstandings. A petition, signed by more than 20,000 people, clarifies some of the differences:

In Pokémon X / Y, there is a Flare Grunt who says “nos importa un pito”. This may sound very mild in Spain (“we care very little about something”), but in Latin America it’s a very aggressive and insolent expression… these are just a handful of examples in a game packed with dialogue. So how do you explain to a Latino kid that the words they see on the screen of their favorite game can’t be said out loud in public?

This particular issue in gaming localization points out the detrimental consequences of not having proper localization processes in place for Latin America—even for games as successful and renowned as Pokémon. 


North America

It’s fair to say that developments and trends recorded in North America pertain to the rest of the world, as many of the biggest research firms (Google, Microsoft, Meta, etc.) and the largest LSPs (Lionbridge, TransPerfect, LanguageLine, etc.) reside here. Most of what researchers and professionals cite as “global advances in the translation industry” originates in North America—although communication with researchers from all over the world remains crucial to the research done there. 

At the same time, Ofer Tirosh—CEO of Tomedes—has outlined some major sectors in the North American market that are expected to grow and bloom in the coming years. Machine translation is an obvious answer, and with it, post-editing. Tirosh also points out that business translation and media localization will be important issues to keep in mind, as well as e-learning and medical translation, although the Asia-Pacific market showcases similar trends in those areas. 


Overall, it’s safe to say that Europe, North America, and the Asia-Pacific regions are progressing in a similar direction; researchers in Latin America and Africa, however, face problems with finding enough funding and resources to continue their work. Nonetheless, development continues as economies start to flourish in the wake of the pandemic. 

Another idea to keep in mind is that regions are not necessarily homogeneous; for example, China and Japan—while the biggest players in the region—do not represent the Asia-Pacific market as a whole. South Korea, alongside Vietnam and other major Southeast Asian markets, looms ever larger in the background. The same applies to Africa: major local languages are developing AI-based communication tools at varying speeds, so a more nuanced approach to understanding the market is required. 


Here at Sprok, we are dedicated to meeting your business goals and pursuits. Our translators and localization specialists deliver translations and project localization of the highest caliber, so visit sdts.pro, ask for a free quote today, and enjoy our high-tier professional localization.




The Latest News in Translation: Deepfake Voices, Beijing Olympics, and More

COVID-19 is all that’s on people’s minds. After all, how could we not think about it, when it’s taken so many people and jobs away from us? But while workers in some industries have taken detrimental blows to their livelihoods (e.g. hospitality, airlines), other industries have flourished by taking advantage of the pandemic-era shift into the virtual workspace. Translation is one of them. 

The translation industry has experienced substantial growth over the last two years, driven by factors such as virtual work environments (in light of the pandemic) and increased media consumption. Today, we’d like to highlight some recent news in the translation sector as a way to discover new trends and meaningful developments in the industry. What does this news tell us about the direction of the translation industry, and how can we stay up to date in this ever-developing field? Perhaps this article will give you some ideas. 


Stanford University study reveals AI communication favors privileged populations

When we think of artificial intelligence, we tend to think of it as a universal, democratic accomplishment, as if developing a machine nearing human parity is an achievement of the times, a lasting proof of our zeitgeist. However, a 2021 study by researchers at the Stanford Social Media Lab found that developments in AI-mediated communication tools (including transcription, translation, and voice-assisted communication) will be “positively associated with access, socio-economic factors such as education and annual income, and AI-mediated communication tool literacy.” In other words, developments in AI will primarily serve the needs of the privileged population, not its less privileged, under-resourced counterpart. 

The study engaged with various aspects of the computer-human relationship, documenting which kinds of AI-mediated communication tools people used most (voice-assisted communication, language correction, predictive text suggestion) and who was most likely to use them (younger, digital-native users). Unsurprisingly, the results clearly show that the demographics most positively affected by the adoption of AI-mediated tools were middle- and upper-class citizens with “unaccented English.” “Sadly, as we might expect, people with lower amounts of income and people with lower levels of education were much less likely to know about these technologies and use or engage with them in their lives,” said Jeff Hancock, founder and director of the Stanford Social Media Lab.

These cold realities of the current state of AI development force us to ask: how do we make technology equally available to people, regardless of class, language, and education? This is a question researchers and scientists—as well as everyone involved in the broader field of AI and, to an extent, translation—must grapple with. The study points us toward a more equitable playing field for technology users of all backgrounds and origins. 


AI-based dubbing service Deepdub raises $20 million in funding

The latest news to stir up the language industry is Deepdub’s impressive $20 million Series A funding, led by New York-based global venture capital and private equity firm Insight Partners. They are joined by other investment firms, old and new, as well as private investors such as Emiliano Calemzuk (former President of Fox Television Studios), Kevin Reilly (former CCO of HBO Max), and Roi Tiger (VP of Engineering at Meta). Deepdub is an Israel-based company that provides novel dubbing services for content worldwide, powered by artificial intelligence technology. 

What sets Deepdub apart from other dubbing services is that they utilize AI technology to retain the voices of the original actors, allowing international viewers to watch “their favorite film and TV programs dubbed in native languages without losing any aspect of the original experience.” Combined with their localization efforts, Deepdub is taking AI to new heights with their ingenious idea to apply AI to voice acting, opening up opportunities for AI tech adoption in the realm of media content and entertainment.

But some raise concerns about safety and job security. As was the case with deepfakes, AI is again being used to manipulate human likenesses. Artificial intelligence has the power to emulate the very real, very human lives of people, and there’s no telling what Deepdub’s aural manipulation could do. Furthermore, Deepdub’s voice manipulation technology could put numerous voice actors out of work. Quite a bit of time remains before Deepdub can produce a functioning model capable of universal use, but these are issues to take note of as we gradually acclimatize to a more AI-friendly, AI-driven world. 


Viva Translate opens US translation job market to Latin American professionals

While not as impressive as Deepdub’s $20 million funding, Viva Translate’s $4 million round does not betray Viva’s noble and equally important mission of facilitating translation and business between Latin America and the US. With its patented Spanish-English machine translation (supposedly 10 times better than Google Translate), Viva Translate has built a streamlined process through which Latin American professionals can engage with US clients, opening up previously untapped job markets left unfilled by American freelancers. 

Viva has come to the forefront of the language industry by making use of the vacuum of workers in the US—due to the pandemic-induced event aptly named “the Great Resignation”—and connecting capable workers from Latin America to meet demand. The company was founded by a team of Stanford University researchers and offers novel functions such as real-time email translation and offer negotiation to aid freelancers in their journey to find work. More than 50,000 freelancers now use Viva Translate to communicate with clients abroad. 


Real-Time AI translation at the Beijing Olympics

While the pandemic has forced officials to set strict distancing measures in Beijing, Olympic athletes and coaches still had to communicate with local workers and vendors. Domestic translators, such as the iFlytek Jarvisen, have greatly facilitated translation and interpretation between Mandarin speakers and international visitors. The Jarvisen—a small yet formidable device solely for voice translation—translates between 60 languages and handles professional vocabulary in healthcare, IT, finance, legal, sports, and energy fields. 

Usage, it seems, was limited to restaurants, where mistranslations, however infrequent, did occur. German journalist Frank Schneider told Reuters of an amusing incident in which his pronunciation of “cow milk” was understood as “cough milk.” Another anecdote involves mushrooms translated as “fungus.” Such cases show that, no matter how capable and robust machine translation becomes, it can still miss basic nuances in everyday registers. 


These are the latest news items in the language sector. What do these happenings tell us about the direction and growth of translation (machine and human) in the real world? For one, technological advancement always initiates adoption and implementation in real-world settings. Despite imperfections and possibly critical mistranslations, MT is utilized in events such as the Olympics to aid in communication; after all, some level of translation, nuanced or not, is better than no translation at all. On the flip side, such implementation of novel technology also proves that human translation is just as necessary and imperative as machine translation. Companies such as Deepdub and Viva Translate marry human resources with technology to provide the most satisfactory results. This narrow space—in which machines and humans intermingle and cooperate—seems likely to be the most productive source of innovation and meaning for years to come. 


If you liked this article, consider visiting sdts.pro and asking for a free quote on your next translation or localization project. Sprok DTS stays up to date on the latest technological and global developments in the language industry so that we can utilize the power of MT and human translation to deliver the best translation and localization experience.




Responsive Machine Translation and the Search for Metadata

Metadata is familiar enough of a concept to the average person. We know it from movie metadata (director, release date, actors, etc.) and book metadata (author, cover photo, font, ISBN, etc.). Even if one is not familiar with the concept of metadata, the word itself is self-explanatory. “Meta,” from the Ancient Greek for “after”, takes on the meaning of the preposition “about” in modern parlance. Therefore, metadata refers to “data about data”: secondary data that informs the user about the main data at hand. 

Metadata is a potent tool for customizing machine translation to fit the nuanced needs of the user, but not many people know about this. After all, machine translation is a relatively arcane and scientific area of expertise and not the most user-friendly for the average language industry worker. “Media and game localizers may not be taking full advantage of the cost and efficiency benefits offered by MT because translation work is still constrained by the limits of popular black-box MT systems,” writes AppTek in a 2021 article on Slator. Many people don’t quite understand how MT systems work, and as a result, they waste time and effort building model after model, engine after engine, doing pretty much the same thing in a different context. 

But AppTek has a solution; they’re working on MT systems that can be fitted with better metadata to provide more detailed, customizable translations to their users. “Metadata can be leveraged in the post-editing workflow to increase quality and boost productivity,” they advertise. AppTek challenges us to think outside the (black) box: “what if instead the register of the language required in each case could be taken care of by the same model…?” What if we could, at the flip of a switch, modify the tone, diction, and mood of the speaker? What if we could clarify that the recipient of a translation was a certain gender, class, or position? There would no longer be a need for separate translations each time; no need for building completely new systems for every single register or tone. 

This is AppTek’s new idea: MT that can be augmented with different types of metadata to better accommodate a wider range of linguistic nuances. In a white paper published in 2020, AppTek details the kinds of changes metadata can bring to a static MT system. They outline eight specific metadata categories that can substantially benefit machine translation users:

  1. Style
  • formal, informal
  • for example, tu/usted (Spanish “you”) or Sie/ihr (German “you”) distinctions in the second person, and corresponding grammatical inflections
  2. Speaker Gender
  • male, female, nonbinary
  • corresponding grammatical inflections and contextual understanding of the text
  3. Domain or Genre
  • data regarding tone or language depending on the form of the content (news, entertainment, talks, etc.)
  • e.g. Essen vs. Nahrung (“food”) in German
  4. Topic
  • caters to “more specific document-level style and terminology differences”
  • allows MT systems to understand the context, thereby allowing for more accurate translation of ambiguous word choices
  5. Length
  • users will have the ability to control the length of the translation with “minimal information loss or distortion”
  6. Language Variety
  • MT systems will be able to parse through and translate mixed-language content
  • can be useful in English-Hindi, Ukrainian-Russian, Castilian-Latin American Spanish, European-Brazilian Portuguese, and other hybrid or dialectal language combinations
  7. Extended Context
  • allows MT systems to assess “whether or not the context of the previous or next source sentences should influence the translation of a given sentence”
  • MT systems will understand documents much better as a whole, allowing for better pronoun and noun agreement
  8. Glossary
  • MT systems will translate certain words according to a given glossary, allowing for more conformity in the final result
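Categories like style, gender, and domain are commonly supplied to a neural MT model as side constraints: pseudo-tokens prepended to the source sentence, so that one model can learn to vary its output per tag. Here is a minimal sketch of that tagging approach; the tag names and the function below are illustrative assumptions, not AppTek’s actual interface.

```python
def add_metadata_tags(source, style=None, speaker_gender=None, domain=None):
    """Prepend metadata pseudo-tokens ("side constraints") to a source sentence."""
    tags = []
    if style:
        tags.append(f"<style:{style}>")
    if speaker_gender:
        tags.append(f"<gender:{speaker_gender}>")
    if domain:
        tags.append(f"<domain:{domain}>")
    return " ".join(tags + [source])

# The same English sentence, conditioned for two registers; a model trained
# on tagged data would then render the Spanish "you" as usted vs. tu.
print(add_metadata_tags("How are you?", style="formal"))    # <style:formal> How are you?
print(add_metadata_tags("How are you?", style="informal"))  # <style:informal> How are you?
```

The appeal of this design is that a single trained model serves every register, rather than one model per tone or audience.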

These are but some of the numerous possibilities metadata opens MT systems up to. While not a tremendously revolutionary development, the application of metadata to MT solves “post-editing challenges in ways not possible in previous NMT generations,” slowly raising the bar for MT machine intelligence and capacity. 

Arle Lommel, senior analyst at CSA Research—an independent market research firm—claims that the next major trends in machine translation will be the “shift to context-driven MT” and the “emergence of metadata-aware MT.” Having the proper metadata to account for nuances and important factors in translation will greatly improve the level of human parity in MT systems, Lommel argues. Lommel calls this kind of MT system “responsive machine translation,” as it can “respond intelligently to stakeholder requirements at multiple levels and deliver the best possible output for given contexts.” AppTek, in essence, is developing responsive machine translation. 

The responsive machine translation model seems to be the logical stepping stone to further developments in MT. While not completely revolutionary, the responsive MT model drastically improves on the current MT status quo with its context-driven translation at the segment and document level. And translating into languages where metadata is critical—Japanese with its honorifics, English with its varieties, Chinese with its dialects—will become all the easier, bridging the gap between cultures and people. 


If you liked this article, consider visiting sdts.pro and asking for a free quote on your next translation or localization project. Sprok DTS stays up to date on the latest technological and global developments in the language industry so that we can utilize the power of MT and human translation to deliver the best translation and localization experience.





The Death and Rise of the Modern Translator

With his 1967 essay, “The Death of the Author,” theorist Roland Barthes dealt a hefty blow to the authoritarian grip authors had over literature. “Literature is that neuter, that composite, that oblique into which every subject escapes, the trap where all identity is lost, beginning with the very identity of the body that writes,” Barthes posits, effectively severing the once-unbreakable tie between author and work. 

The modern translator faces a similar identity crisis. In the past decades—thanks to the advent and rise of artificial intelligence and machine translation—visionaries and technological prophets have been predicting the death of translators worldwide. We discuss one prominent visionary in another article: the Dutch language expert Jaap van der Meer, who equates the rise of artificial intelligence with the eradication of human translators. With all this talk of robots replacing humans, translators are on edge, fearing for their vocations as if, any minute now, they will hear news of the invention of a universal translation machine that will render them useless for good.

But as many readers realize, “The Death of the Author” wasn’t intended to oust authors from their tenured positions or undercut their role and prowess in any way. If anything, Barthes’ essay sheds light on the curious phenomenon of literary works taking on life after creation: a life beyond the author’s intentions. By severing the tie between writer and opus, Barthes gives a newfound purpose and power to authors; writers are no longer mere storytellers or transcribers of thoughts—rather, they are creators, like a deity, blowing life into soulless words. 

The decades-long debate over whether human translators are necessary or obsolete can be understood in a similar way. The notion of the death of the translator stems from a long discourse about machine translation, its rise to stardom, and its seemingly infinite capabilities. As justified as translators are in fearing for their livelihoods, it’s one thing to understand machine translation developments as a threat and another to think of MT as a reformative piece of technology that enriches human translation. 

The integration of machines in translation is a recent phenomenon. But as we speak, translators all over the world employ machine translation (or forms of it, such as computer-aided translation and post-editing) to produce more accurate, more efficient translations. While thinkers and futurists might ramble on about the singularity, perfect artificial intelligence, and human parity, translators are doing the hard work of incorporating new technology into their human, lived experience of translating. While theorists preach the death of the translator, real human translators are already grappling with what it means to live in a hyper-technological world, negotiating for themselves a hybrid workplace alongside machines. 

Such is the defense of human translation in a purely ideological discourse. One look at the real world presents an overwhelmingly positive view of the human translator. Here are some recent real-world developments in which human translation is necessary and important. 


  1. Translators in Big Tech

Slator, the popular news source for all things related to the language industry, recently published an article with job openings at Silicon Valley companies—jobs for translators, linguists, and localization specialists. “Machine translation (MT) and adjacent language technologies are driving industry-wide demand for natural language processing (NLP) engineers and machine learning researchers. But that trend is not the whole story. The same companies are also aggressively hiring qualified linguists,” says Seyma Albarino, a staff writer for Slator. 

According to Albarino, there is a vacuum between traditionally human roles (customer support, QA) and technical roles (MT, etc.) when it comes to positions in language and linguistics. Some of Albarino’s findings include:

  • Apple: technical translator, localization and editorial producer
  • Meta: language manager, market specialists
  • Amazon: data linguist, product manager
  • Google: search language specialist, market responsibility specialist
  • Tencent: senior localization manager (regional), marketing operations manager (regional)

These are just a few of the numerous linguist and translator jobs available in Big Tech. Even Google, Amazon, and Meta—pioneers of the machine translation industry—still need and hire human translators and linguists, given the inaccuracy of machine translation. 


  2. New University Programs in Translation

Last year, Yale University announced the inception of the Yale Translation Initiative, whose mission is “to promote the interdisciplinary study of translation at Yale and beyond, encompassing its literary, social, political, economic, legal, technological, and medical dimensions.” The initiative is for Yale undergraduates and Ph.D. candidates, who will be allowed to partake in projects, seminars, and courses to enrich their understanding of translation as a feasible, necessary skill. “Translation has become increasingly central to the workings of the contemporary world,” states its introduction; courses are offered in a wide range of subjects, from literary translation to machine translation. 


  3. Video Streaming in a COVID-19 World

With so many people suffering from pandemic-induced isolation, video streaming has hit new highs in its popularity. Viewers can choose from a plethora of services (Netflix, Hulu, Disney Plus, HBO Max, etc.) depending on their tastes and celebrity crushes. The more global video streaming becomes, the bigger the demand for translation—especially subtitling and dubbing—gets. 

Netflix, the most popular and well-known among its competitors, revealed their investments in subtitling and dubbing in their January 20, 2022 conference, as part of their presentation on Q4 2021. Gregory K. Peters, the COO and Chief Product Officer of Netflix, announced that Netflix has “subtitled 7 million run time minutes in ’21 and dubbed 5 million run time minutes.” What’s more, Netflix is learning “how to do that better and how to make that localization more compelling to our members.”

That is a step in the right direction, especially for Netflix, which has recently come under fire for its half-hearted, inaccurate subtitle translations for the South Korean hit TV series Squid Game. The errors, semantic in nature, stem from the translator’s decision to eliminate cultural nuance in favor of a more localized translation: a decision that, in this context, is deemed wrong, but one that a computer couldn’t possibly make given its current limitations. 

This Netflix anecdote, however controversial, illustrates an important point: translation is a deliberate, careful process consisting of minute, nuanced decisions. Whether a given translation turns out right or wrong, this process of deliberation is what makes human translation necessary and important. Machine translation couldn’t possibly accomplish such a feat; what is needed, then, is more monetary investment in content translation, so as to improve the quality of human translation. This leads us to the last point of this article: high-risk source texts. 


  4. Translation in High-Risk Situations

A 2021 study by researchers at the Olive View-UCLA Medical Center and the Memorial Sloan Kettering Cancer Center tested the accuracy of Google Translate in medical text translations for seven commonly spoken languages. Their findings were not surprising; the authors report that “GT [Google Translate] for discharge instructions in the ED [Emergency Department] is inconsistent between languages and should not be relied on for patient instructions.” The study is of note mainly because medical translation never quite gets the attention it deserves; the authors note that “many hospitals have no mechanism for written translation,” and therefore “ED providers resort to the use of automated translation software, such as Google Translate (GT) for patient instructions.” 

Translations in languages such as Spanish and Tagalog showed decent accuracy: Google Translate’s into-Spanish translation had a 94% accuracy rate, while Tagalog had a 90% accuracy. Korean and Chinese fared a bit worse, with 82.5% and 81.7% accuracy respectively. On the other end of the spectrum, Google’s Farsi translation received a 67.5% accuracy rate and Armenian, 55%. Of these mistranslations, some were capable of “potential harm,” up to 2% in Spanish and 8% in Chinese. 

COVID-19 has brought these issues into the spotlight; linguistic and cultural differences have hindered medical treatment in many COVID-19 cases, as can be seen in this report by researchers at Lake Erie College of Osteopathic Medicine. 


The necessity of human translators is not bound to these four particular fields, but these four points should be enough to firmly argue against the death of the translator. Translators are not dead; they are alive and critically needed in a number of industries, of which technology, literature, entertainment, and medicine are mere examples. 

The discourse against translators is not just discouraging, but also harmful. In a tech-savvy move toward mechanization, companies have begun to rely on lazy machine translation and underfunded human translators to cut costs and time when, instead, translators should be nurtured, trained, and paid in full for their efforts at bridging gaps in communication. These questions and issues matter far more than whether or not computers will replace translators in the future. 


Here at Sprok, we deliver precise, professional translation and localization services to our customers. Our translators and localization experts work tirelessly to serve your needs in 72 languages. Ask for a free quote today on our website, sdts.pro.





Jaap van der Meer and the Future of Translation

The Dutch have always been keen on networking, whether inland with their interconnected canals, or overseas in their swashbuckling trade voyages to the East. These are broad historical generalizations, of course, but serve as a metaphor for Jaap van der Meer, a visionary Dutch linguist who carries on that legacy of networking, envisioning a world coming together through the power of translation and data collection. 

Born in The Hague in 1954, van der Meer attended the University of Amsterdam, where he majored in literature and linguistics. In 1980, van der Meer started his first translation company, INK, which developed translation memory and terminology lookup software; 25 years later, he founded TAUS, a think tank and language data network that offers the largest industry-shared repository of data in language engineering. Nearly two decades after that, van der Meer is now revered as an innovator, pioneer, and visionary in the contemporary language industry. 


So what’s his deal, and what kind of future does he advocate? On the TAUS website is van der Meer’s brief and succinct manifesto for the future of translation—“Reconfiguring the Translation Ecosystem”. “A reconfiguration of the translation system is inevitable and is in fact already in full swing,” van der Meer starts, and launches into introducing what he believes to be the future of the translation industry. 

Traditional translation and localization is the work of a creative, he argues, and the most expensive and time-consuming; a general estimate of the cost of human translation is between 100,000 and 150,000 euros per 1 million words. With the advent of free MT platforms, the need for human creativity and intervention is on the decline; in its stead are markets for MT models and data, which offer near-human translations at a fraction of the price. “In the transition phase that we are in now, AI technology is being molded into existing processes, which in turn leads to a deglamorized and devalued role for the human translator,” he claims. What we should be focusing on, instead, is devising better models for AI-powered translation and feeding them clean data.

Van der Meer repeats his argument in a 2021 article for the language-industry magazine MultiLingual titled “Translation Economics of the 2020s,” where he delves deeper into current developments in machine translation—as well as the language industry in general—and makes a prediction about the industry’s near future. 

A timeline of technological advancements. Jaap van der Meer, “Translation Economics of the 2020s”

One of the contentious predictions he makes is regarding the “mixed economy” condition of the translation industry, referring to the coexistence of human translators and free, near-zero-cost translation machines. “Once the right infrastructure is in place,” he writes, “the production of a new translation costs nearly nothing and capacity becomes infinite.” Van der Meer goes on to claim that the mixed economic model will no longer be sustainable in the future; machines will replace human translators, in line with the general current towards singularity. 

His ideas are not completely ungrounded, to be fair. Van der Meer cites a number of instances in which technological advancements have put to rout entire industries and businesses: Kodak (beat out by Sony’s digital cameras), Blockbuster (thanks to Netflix and streaming media), and the taxi industry (fighting against Uber and Lyft). The same goes for translation, he argues. “In 2019, Google alone translated 300 trillion words compared to an estimated 200 billion words translated by the professional translation industry,” he writes. “By 2025, enterprises will see 75% of the work of translators shift from creating translations to reviewing and editing machine translation output.”

The focus of the language industry, then, is no longer the quality and commerce of human translation, but rather the development and upkeep of translation machines. In this new paradigm, humans are no longer the most valuable resource. It’s data: data required to feed and train translation machines to perfection and human parity. It’s only sensible, van der Meer seems to argue, that the translation industry undergoes this paradigm shift, alongside numerous other industries facing similar reconfigurations. In a world where machines slowly work toward matching human capacity, data is king. And with data comes concerns over copyright: who has claim to the source text? The translatum?

An overview of the modern translation pipeline. Jaap van der Meer, “Translation Economics of the 2020s”

Van der Meer’s predictions of a reconfigured, completely MT-powered future have been criticized by professional researchers and translators. Alan Melby of the International Federation of Translators (FIT) and Christopher Kurz, Head of Translation Management at ENERCON, responded to van der Meer in a follow-up article aptly titled “Data: Of course! MT: Useful or Risky. Translators: Here to Stay!” in which they argue for the necessity of human translators in upholding the rigorous standards of translation. 

Their vision of the future is much more hopeful for translators: “We believe that the current mixed economic model is not only sustainable but beneficial to society,” write Melby and Kurz. “Consequently, we believe that there is definitely a future for professional human translators.” To prove their point, the authors rebut van der Meer’s arguments in detail. 

The first flaw in van der Meer’s vision is his definition of data; Melby and Kurz point out that van der Meer is too vague about exactly what kind of data he’s dealing with. “It is not clear which data type is the focus of [van der Meer’s] article,” they say. “It also confusingly labels metadata as ‘translation data.’ We reject this label for metadata.” For Melby and Kurz, there are numerous types of data with different usages in different contexts; in that sense, van der Meer’s article can only be construed as vague and nonspecific about how it would deal with data. 

The complicated nature of data leads to another refutation: that translation cannot always be “zero cost.” Given the numerous types of data (co-text, XLIFF, TMX, metadata, their subsets, etc.), human intervention is necessary to maintain, categorize, and clean the data needed to fuel translation machines. 

Another main argument posited by Melby and Kurz is that computers are simply not capable—and will take a long time, if ever, to be capable—of “understanding” context in a document. “A system can be trained on massive amounts of data and produce impressive results without understanding language,” they remark, but point out that “[these results] have not brought us closer to an understanding of how humans process language.” In other words, machine translation is still, at best, a mere text processor, incapable of understanding. And for this reason, machine translation could not possibly replace human translation in the short time span van der Meer claims. 

The Hans-Christian Boos Pyramid: a model of machine-learning processes. Alan K. Melby and Christopher Kurz, “Data: Of Course! MT: Useful or Risky. Translators: Here to Stay!”

Because of their lack of intelligence and true understanding, computers cannot be trusted to take on the nuances and complexities of translation in cases where “errors in the translation can cause damage, injury, or harm.” Human translation is not a “creative” act as van der Meer claims; if anything, “creativity that ignores agreed-upon requirements is unwanted in the majority of today’s professional translation industry.” Melby and Kurz’s idea of the human translator is one of rigid rules and strict standards—“fulfilling the production phase’s requirements.” And here is the crux of Melby and Kurz’s article: “humans can check their own behavior in the translation process and verify their translations against specification.” They doubt that machine translation systems can do the same. 

When asked about the criticism and debate his original article stirred up, van der Meer retorts that “people are locked up in their here and now, and they don’t see what’s really happening with the world.” Van der Meer sees the world as rapidly changing—at breakneck speed—given how fast the world has changed in the last decade or two. He places his faith in the evocative, seemingly limitless power of technology to innovate and reconfigure the language industry, and it sounds good, too—to think of a world in which words, sentences, and paragraphs are translated, nuanced and delicate, in the blink of an eye. “We humans have to outsmart the machines, which means we shouldn’t become slaves to them and do the stupid work of correcting their output.”

A pie chart of how much work machine translations can handle. Melby and Kurz, “Data: Of Course! MT: Useful or Risky. Translators: Here to Stay!”

But for Melby and Kurz, the “stupid work” is what translation entails. Translators have a duty to provide the best, most accurate translations for their clients and customers; if tedious post-editing is what it takes to do that, then that is what translators must do. In Melby and Kurz’s eyes, van der Meer is an idealist, obsessed with the nobleness of human vocations. Van der Meer’s utopia is one where humans don’t have to lift a finger to get work done; for Melby and Kurz, such a utopia destroys all raison d’être and poses no solutions for a pre-singularity time. 

Who is right? Only time will tell. Van der Meer is visionary and futuristic, but overly idealistic and vague. Melby and Kurz are experienced and professional, but nostalgic. Their differences come down to their belief in technology. And technology, as we all know, has impressed us when we least expected and let us down in our moments of need. 




Major Academic Breakthroughs in Eliminating NLP Gender Bias

We recently blogged about the dangers of NLP replicating the gender bias present within natural languages. The scope of the last blog was confined to major breakthroughs by Google, but academic research groups have also done much work examining gender bias in NLP. These studies are remarkable in their reach and originality as well, and we wish to highlight major academic research in the field today. 

Christine Basta of the Universitat Politècnica de Catalunya explains the history of wrestling with gender bias within academic research. She traces the origin of the discourse to 2016, when Bolukbasi et al., composed of members from Boston University and Microsoft Research, published a paper on gender bias in word embeddings. The paper is notable for defining gender bias as “its projection on the gender direction,” or in other words, “the more the projection is, the more biased the word is.” Here is an example of words with extreme bias projections:

Bolukbasi et al. try to mitigate the bias by shortening the distance of the projection, effectively neutralizing the bias of these aforementioned words. In a 2019 paper, however, Hila Gonen and Yoav Goldberg of Bar-Ilan University reveal that simple removal methods are ineffective. Debiasing is a superficial method, they posit, noting that there is a “profound association between gendered words and stereotypes, which was not removed by the debiasing techniques.”
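To make the projection idea concrete, here is a minimal, self-contained sketch of measuring bias as a projection onto a gender direction and then neutralizing it, in the spirit of Bolukbasi et al. The 3-dimensional vectors and the word choices are entirely made up for illustration; real word embeddings have hundreds of dimensions and the gender direction is estimated from many gendered pairs, not one.

```python
# Toy sketch of the "gender direction" bias measure and the hard-debiasing
# (neutralize) step. All vectors are hypothetical toy values.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def scale(u, c):
    return [a * c for a in u]

def unit(u):
    n = dot(u, u) ** 0.5
    return [a / n for a in u]

# The gender direction is approximated by the difference of a gendered pair,
# e.g. "he" - "she"; here with toy 3-d embeddings.
he  = [0.9, 0.1, 0.0]
she = [0.1, 0.9, 0.0]
g = unit(sub(he, she))

def gender_bias(word_vec):
    """Bias = magnitude of the word's projection onto the gender direction."""
    return abs(dot(word_vec, g))

def neutralize(word_vec):
    """Remove the component along g, making the word gender-neutral."""
    return sub(word_vec, scale(g, dot(word_vec, g)))

programmer = [0.6, 0.2, 0.5]    # toy vector for a stereotyped occupation
print(gender_bias(programmer))              # nonzero before debiasing
print(gender_bias(neutralize(programmer)))  # essentially zero afterward
```

Gonen and Goldberg’s objection, in these terms, is that zeroing out this one projection leaves the rest of the vector, and hence the word’s neighborhood of stereotyped words, largely intact.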

But progress is hardly linear; let’s backtrack to 2017, when researchers from the University of Virginia and the University of Washington came up with the RBA (reducing bias amplification) method, which puts “constraints on structured prediction to ensure that the model predictions [of images] follow the same distribution in the training data.” Corpus-level constraints are effective in reducing gender bias amplification, but as Gonen and Goldberg would later show, such constraints remain artificial and superficial fixes for the stubborn association between gendered words and stereotypes. 

Other studies accomplish similar purposes; researchers from the University of Washington also carried out evaluations of gender bias in machine translation in 2019, and 2018 saw interesting research using coreference to examine stereotyping behavior in machines. In the end, however, these studies mainly identify problematic behavior and suggest mitigating methods that, quite frankly, do not serve as ultimate solutions to the problem of gender bias in machine translation. 

But there is hope yet. Gonen and Goldberg’s 2019 research is particularly fascinating precisely because it reveals that “profound association between gendered words and stereotypes.” Their finding suggests that changes in algorithms alone will never be enough to eradicate gender bias in language; if anything, the world must change alongside machines. Machines are biased only because the natural languages of the world are biased. 

Particularly noteworthy is a 2021 paper on the current state of gender bias in MT, written by researchers from the University of Trento and the Fondazione Bruno Kessler, which argues for a “unified framework” to facilitate future research. The authors note that the study of gender bias in MT is a relatively new field and briefly summarize previous analyses, an important and necessary job. 

This paper is more remarkable, however, for the way it reveals the very necessary connection MT has with society. “To confront bias in MT, it is vital to reach out to other disciplines that foregrounded how the socio-cultural notions of gender interact with language(s), translation, and implicit biases. Only then can we discuss the multiple factors that concur to encode and amplify gender inequalities in language technology,” the authors write. In other words, MT research must inevitably grapple with social structures and ideology to make for a wholly unbiased translation machine. 

The authors end their paper with a few hopeful directions MT could take: model debiasing, non-textual modalities, thinking beyond gender dichotomies, and more representation in the research process. These are all feasible, manageable methods and steps that make for a more nuanced and equal language model, safe from the biases that plague and corrupt human language. 

The applications of this are manifold; beyond gender bias, there remain ethnic and racial bias, economic class bias, and political bias, among numerous others. Similar methodologies can be applied to eradicate these biases from MT models. Until then, human translators have the duty to ensure that no nuance is lost in translation and that their work is not affected by the biases that often encumber and muddle language. 

Here at SDTS, our translators and localization experts are attentive to the ways in which machine translation iterates bias; we make sure our solutions are bias-free and inclusive. If you’re looking for a translation and localization service, give SDTS a try: our team of translators and localization experts ensure that your translations are of the utmost integrity and accuracy. 




Lifelong Learning Systems: the Frontier of AI-Powered Machine Translation

As of now, there isn’t much that separates humans from machines. But the differences that remain are significant, rooted in the lived experience of humans and the ontological gap between us and our creations: things like free will (arguably), understanding, and emotion. Outside these human characteristics, machines can outpace us by far; they run calculations and all sorts of incredible tasks at speeds no human would dare to dream of.

Another major difference between a human and a machine is memory: humans retain a (nearly) lifelong account of their experiences. While memory does thin out the further back we recount, our capacity to retain it is comparatively better than that of a machine system. This is a particular issue with modern machines; neural network-based models, according to UPC’s Magdalena Biesialska, “learn in isolation, and are not able to effectively learn new information without forgetting previously acquired knowledge.” As the standard base model for modern translation engines, neural networks lie at the heart of the quest for better machine translation.

This limitation of neural networks has prompted, over the last decade or so, a study of machine memory. Specifically, researchers are keen to discover methods through which neural networks can more efficiently access previous knowledge, much as a human would. Here is a timeline of lifelong learning research:

Image credit: Magdalena Biesialska, “Major Breakthroughs in Lifelong Learning

Research into lifelong machine learning aims to emulate human learning in machines: a continuous effort to learn and adapt to new environments, and to apply accumulated knowledge to future problems. In other words, machines, with enough development in this area, “should be able to discover new tasks and learn on the job in open environments in a self-supervised manner.” UIC professor Bing Liu emphasizes the necessity of such lifelong learning; without it, Liu says, “AI systems will probably never be truly intelligent.”

This brings us to 2016, when Zhizhong Li and Derek Hoiem of the University of Illinois at Urbana-Champaign published one of the first works on lifelong learning in the context of deep learning. Li and Hoiem introduce the concept of Learning without Forgetting (LwF), which “uses only new task data to train the network while preserving the original capabilities.” Previously learned knowledge is “distilled” through a distillation loss to maintain performance, and no prior training data is needed. Biesialska explains it in simpler words: “first the model freezes the parameters of old tasks and trains solely the new ones. Afterward, all network parameters are trained jointly.” This way, the neural network is less likely to suffer from amnesia (catastrophic forgetting, in computer lingo), but LwF is limited in the kinds of new tasks the machine can learn. 

Image credit: Li and Hoiem, “Learning without Forgetting
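The distillation signal at the heart of LwF can be sketched in a few lines. This is a deliberately tiny, self-contained illustration with made-up logit values, not the authors’ implementation: the old (frozen) network’s softened outputs on new data serve as targets, so the new network is penalized for drifting away from its old behavior even though no old training data is kept.

```python
import math

def softmax(logits, T=1.0):
    """Softmax with temperature T; higher T softens the distribution."""
    exps = [math.exp(l / T) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(old_logits, new_logits, T=2.0):
    """Cross-entropy between the softened old and new outputs."""
    p_old = softmax(old_logits, T)
    p_new = softmax(new_logits, T)
    return -sum(p * math.log(q) for p, q in zip(p_old, p_new))

old_out = [2.0, 0.5, -1.0]   # frozen network's response to a new example
matched = [2.0, 0.5, -1.0]   # new network still matches -> low loss
drifted = [-1.0, 0.5, 2.0]   # new network "forgot" -> high loss

assert distillation_loss(old_out, matched) < distillation_loss(old_out, drifted)
```

In training, this term would be added to the ordinary new-task loss (weighted by some coefficient), which is what lets the joint phase tune all parameters without erasing the old tasks.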

The same year, UK-based AI research lab DeepMind and the bioengineering department of Imperial College London collaborated on a new solution to catastrophic forgetting. Their approach is called Elastic Weight Consolidation; unlike LwF, which uses knowledge distillation to prevent catastrophic forgetting, EWC “remembers old tasks by selectively slowing down learning on the weights important for those tasks.” Parameters that matter most to previously learned tasks are constrained, mimicking human synaptic consolidation. 
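The EWC idea reduces to a quadratic penalty: each weight is pulled toward its old-task value with a strength proportional to its estimated importance (in the paper, a Fisher-information estimate). The sketch below uses invented numbers purely to show the mechanic; it is not DeepMind’s code.

```python
# Toy sketch of the Elastic Weight Consolidation penalty:
#   sum_i (lambda/2) * F_i * (theta_i - theta*_i)^2
# where theta* are the parameters after the old task and F_i their importance.

def ewc_penalty(params, old_params, importance, lam=1.0):
    return sum(
        0.5 * lam * f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, importance)
    )

old = [1.0, -0.5, 2.0]       # parameters learned on task A
fisher = [10.0, 0.1, 5.0]    # hypothetical importance of each weight for task A

# Moving an important weight (index 0) by 1.0 costs far more than moving an
# unimportant weight (index 1) by the same amount, so learning "slows down"
# exactly where task A depends on the network.
costly = ewc_penalty([2.0, -0.5, 2.0], old, fisher)
cheap  = ewc_penalty([1.0,  0.5, 2.0], old, fisher)
assert costly > cheap
```

As with LwF, this penalty is simply added to the new task’s loss, so no old training data needs to be stored.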

A year later, in 2017, David Lopez-Paz and Marc’Aurelio Ranzato of Facebook AI Research published their research on Gradient Episodic Memory. Unlike the previous two methods, which are considered regularization methods, GEM alleviates catastrophic forgetting by routinely storing a small portion of data from past tasks. GEM itself is important, but Lopez-Paz and Ranzato are also lauded for their threefold metrics, which evaluate the efficiency and productivity of a GEM-based machine. 
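GEM’s core rule can be illustrated in miniature: before applying an update, check whether the new-task gradient conflicts with the gradient computed on the stored past examples (a negative inner product means the update would increase past-task loss), and if so, project it away from the conflict. The single-memory, two-dimensional version below is an illustrative simplification of the paper’s quadratic-program formulation.

```python
# Toy sketch of Gradient Episodic Memory's gradient projection.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gem_project(g_new, g_mem):
    """Project g_new onto the half-space {g : <g, g_mem> >= 0}."""
    if dot(g_new, g_mem) >= 0:
        return g_new                      # no conflict: apply as-is
    coef = dot(g_new, g_mem) / dot(g_mem, g_mem)
    return [g - coef * m for g, m in zip(g_new, g_mem)]

g_mem = [1.0, 0.0]                        # gradient on the episodic memory
g_conflicting = [-1.0, 1.0]               # would undo past learning
g_fixed = gem_project(g_conflicting, g_mem)

# After projection, the update can no longer increase the memory loss.
assert dot(g_fixed, g_mem) >= -1e-9
```

The cost, relative to LwF and EWC, is that some past data must actually be kept in memory, which is precisely the trade-off that episodic-memory approaches accept.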

2017 also saw the introduction of generative replay in a paper published by Shin et al.; generative replay is an alternative method of storing old data in which pseudo-samples are created, stored, and utilized for future tasks. Biesialska notes that “although the GR method shows good results, it… [is] notoriously difficult to train.” 

How does all this relate to actual language learning, processing, and translation? The previous research coalesced into landmark 2019 work by members of DeepMind, titled “Episodic Memory in Lifelong Language Learning,” which combines previous approaches to lifelong learning and applies them to natural language processing. 

Image credit: de Masson d’Autume et al., “Episodic Memory in Lifelong Language Learning

In short, this 2019 research investigates how an episodic memory model with “sparse experience replay” and “local adaptation” learns continuously and reuses previously acquired knowledge. While lifelong learning is still a budding area of interest and has yet to show any widespread usage among the general public, these kinds of research illustrate the possible developments in machine translation as neural machine translation overcomes catastrophic forgetting. 

In a survey of lifelong learning in natural language processing, Biesialska et al. point out the current limitations on developments in NLP. Machines are not yet able to work with partial data as humans do; machines struggle with systematic generalization about high-level language concepts. However, the authors remain hopeful about the future of lifelong learning in language processing; previous research has honed methodology down to a science and has thus made future developments more promising. 

These advances in machine memory are not synonymous with complete human parity; machines have a long way to go before they can think of besting us at the thinking game. But until then, machines can better help translators do their jobs. With lifelong memory and enhanced memory storage and referral processes, neural network-based models perform better “text classification and question answering.” These functions will hopefully allow the machine to approach the source text with more information and analytic functions, taking some of the burden off of the human translator.

A future with lifelong learning translation machines seems within reach. Neural machine translation will no longer be fettered by memory constraints, allowing neural networks to reach back to years and years of data to synthesize more human, contextually and historically aware translations. 




The Biggest Translation Trends of 2022: Medical Translation, DeepL, and More

Each year, we ask ourselves the same thing: what are some developments in the language industry that we should keep an eye out for? The kinds of issues and events that arise tell us much about the state and direction of the industry, and this year is no different; Slator has recently released their annual Language Industry M&A and Funding Report, as well as the results of a poll on what people deem to be the hottest language industry trend for 2022. Here’s what the internet thinks about the future of translation:

Speech-to-speech machine translation leads the way with 23.2%, followed by mega-mergers among LSPs. This particular trend is reminiscent of 2020, which saw important mega-mergers such as RWS-SDL and Acolad-Amplexor. A smaller 16.2% answered with “continued rise of machine translation,” which outpolled “localization jobs boom” and “better MT post-editing tech.” As for how business is looking, here’s what the internet responded:

Things are looking good for most people, and it’s no coincidence; while the language industry did suffer slightly from the COVID pandemic (the industry was assessed at around $46.9 billion in 2019, a figure that dipped in 2020), the market has since grown to $56.18 billion. Interprenet, a world-renowned interpretation service, notes that the travel and tourism industry suffered heavily during the pandemic; on the flip side, other industries such as social networks, gaming, and healthcare experienced meaningful growth in these increasingly virtual times. With many workers and services moving to online and remote work, more budget is now allocated to localization and translation, in other words, the language industry. 

Interprenet also reveals that there are currently 640,000 linguists working worldwide, with 56,000 of them located in the United States; the US Bureau of Labor Statistics predicts that US employment of linguists will increase by nearly 24% in the next 8 years, with “about 10,400 openings for interpreters and translators” projected each year. Considering how the streaming industry—perhaps one of the biggest beneficiaries of COVID-19—is expected to reach over $70 billion USD this year, the demand for localization specialists and translators has yet to reach its full potential in the coming years. 

The Maverick Group—an advertising and design company based in the UK—puts it in much simpler terms. According to their data, the number of people working in the translation industry has doubled between 2013 and 2020; the numbers are expected to grow by 20 percent from 2019 to 2029, which is “much faster than the average for all occupations.” There are seven particular reasons, according to the Maverick Group, as to why growth seems imminent and inevitable:

  1. Tech and Globalization
    1. As of now, 53% of internet content is written in the English language
    2. However, only 20 percent of the world population actually speaks English
  2. Videos and Podcasts
    1. Average video consumption is 84 minutes per day in 2019
    2. Cisco predicts that 82% of global internet traffic will come from video streaming and downloads by the end of this year
    3. With this rising need for video consumption, the demand for localization, subtitling, and dubbing ensues
  3. E-Learning
    1. Forbes estimates that the e-learning field will be worth around $355 billion by the year 2025, driven by COVID-19-induced remote learning
  4. Remote Working
    1. Demand for translation and interpretation is on the rise, due to COVID-19
  5. Medical Translation
    1. Given the pandemic’s global reach, the field of medical translation has become more necessary and important than ever
  6. A Growing Workforce
    1. The loss of job opportunities due to COVID-19 has forced many unemployed people into the realm of translation as a way of utilizing their language skills as a source of income
  7. Artificial Intelligence
    1. Recent developments in artificial intelligence have created much demand for language specialists capable of facilitating the integration of AI technology into linguistic work

It’s important to note that many of the Maverick Group’s predictions are based on general trends in the industry, which is to say, not specific, concrete data. That does not discredit their ideas; rather, these seven points reveal a slow yet unstoppable rise in the demand for translation and interpretation—more efficient, nuanced, and human ways of communication. 

Nuanced, accurate translation is especially necessary in the field of medical translation, which has been significantly impacted by COVID-19. In a New Yorker article by Clifford Marks, the vital yet demanding work that medical interpreters undertake is narrated in detail through the eyes of a Spanish-English translator named Lourdes Cerna, whose job as a medical interpreter has proven vital to hospitals around the US but at the cost of her own time and mental health. Marks describes her as “part of a burgeoning profession that has assumed a critical role during the pandemic.” Marks also notes that “researchers have found that, when patients do not have access to an interpreter, they are more likely to stay in the hospital longer and to be readmitted later on.”

Medical interpretation is just one of many fields of the language industry that has finally seen the light of day after decades of being underestimated and underfunded. Another field that has risen to the forefront in the past years is machine translation, which has greatly facilitated communication and paperwork in various industries. A recent bout regarding the use of machine translation in the workplace helps us understand how essential MT has become in the professional realm: 

Swiss Post, Switzerland’s national postal service — and one of the country’s largest employers — caused an uproar when it banned employees from using free online translation programs, namely Google Translate and DeepL, for their work.

Slator’s Seyma Albarino explains that the Swiss Post mandated this out of security reasons, and in lieu of these high-tech translation engines, workers were redirected to “Post Translate,” Swiss Post’s very own machine translation service.

But the reaction was largely critical, because Post Translate is not as robust a machine as DeepL or Google Translate, leading to difficulties in processing and translating documents. The reaction reveals a crucial fact about the modern workplace: translation engines are now essential for daily business, especially for companies like Swiss Post that deal with international commerce. 

These events go to show that machine translation is an indispensable cog of the modern workplace. In particular, DeepL has taken its place as one of the most widely used machine translations in business—and it consistently outranks other translation engines in its performance. With innovative network architecture, vast training data, impeccable training methodology, and sheer network size, DeepL lies at the heart of international business, used not only by monolingual workers to translate documents but also by translators and localization specialists to aid in their human translations. 

Charting the path of DeepL’s recent rise to success helps us better understand how deeply the language industry has exerted its influence in the realm of business. According to Slator’s research, DeepL is, after its 2017 release, now “deeply integrated into the business processes of companies across countries and sectors, with clients including Roche, Fujitsu, Axa, Best Buy, Nokia, Rakuten, Siemens, and Elsevier. Deutsche Bahn (DB), the world’s second largest transport company, started using DeepL three years ago… it quickly became popular and is now heavily used across the company daily.” In a 2021 review of its accomplishments, DeepL revealed that “by adding 13 new languages to DeepL Translator… we reached 105 million more native speakers worldwide.” Add to that DeepL’s 1 billion user population: an impressive feat that’s changing the face of international business.

Rivaling DeepL is Google, which “currently has more than 91 research scientists specializing in Machine Translation, along with more than 400 NLP specialists and over a thousand experts in Machine Intelligence.” Compare that with Facebook’s own artificial intelligence-powered machine, which generates 20 billion translations per day on its social network, as reported by Worldcrunch. Spurred by competition and demand, these companies are innovating at a speed unrivaled by any previous advancements in linguistics, and 2022 is no different. Combining the world of language with artificial intelligence, machine translation is surely the next big thing to revolutionize global commerce and daily life. 

We must look at the big picture; the rising need for medical translation and machine translation is the natural byproduct of a larger growth in artificial intelligence technology, catalyzed by the COVID-19 pandemic that has effectively normalized remote work and global communication. Much of what we see as trends are predicated on real-life events such as the pandemic; it’s hard to say what’s to come next in this unpredictable little world of ours. Will a prolonged pandemic state give rise to further developments? Will the world’s attention shift to other sectors—perhaps aeronautical engineering, or maybe global politics—and effectively abandon machine translation development? We can take a stab at reading the future and say that, no, machine translation is a trend that’s here to stay. We have had a taste of the wonders of unaided, unsupervised translation; its possibilities are not only boundless, but crucial for humanity. 

All this isn’t to say that machine translation is perfect; in fact, machine translation is nowhere near it. For proper translation to take place, there must be additional human oversight to reassess machine outputs; translation machines are not yet capable of understanding context and repetition, as well as noticing deliberate word choices and usage. That is why translation services—aided by machine translation, yes, but powered by human translation—are so critical. Human translators are still—and will be—the only ones able to provide satisfactory translations in this increasingly global, increasingly efficient world. 

We here at Sprok DTS are excited to see what 2022 brings to the language industry. We utilize the best technology and the best minds to bring our clients the best, most polished translations possible, and to do so, we are constantly on the lookout for technological insights that might possibly help our translators work better, faster, and more accurately. 





Taking New Directions in Human-Centered Machine Translation

It is a truth universally acknowledged that machine translation can be bad at times. And when it happens, we ask ourselves why it has to be so inaccurate. What exactly are the problems that plague machine translation—especially frontend engines such as Google Translate—and what are researchers and engineers doing to fix it?

Douglas Hofstadter, a cognitive science and comparative literature professor at IU Bloomington, talks about his distrust of Google Translate in an article for The Atlantic aptly titled “The Shallowness of Google Translate.” To give a concrete illustration of the shortcomings of this beloved translation engine, Hofstadter takes a passage from a book by Austrian mathematician Karl Sigmund, written originally in German:

Nach dem verlorenen Krieg sahen es viele deutschnationale Professoren, inzwischen die Mehrheit in der Fakultät, gewissermaßen als ihre Pflicht an, die Hochschulen vor den “Ungeraden” zu bewahren; am schutzlosesten waren junge Wissenschaftler vor ihrer Habilitation. Und Wissenschaftlerinnen kamen sowieso nicht in frage; über wenig war man sich einiger.

Here is Hofstadter’s eloquent translation into English:

After the defeat, many professors with Pan-Germanistic leanings, who by that time constituted the majority of the faculty, considered it pretty much their duty to protect the institutions of higher learning from “undesirables.” The most likely to be dismissed were young scholars who had not yet earned the right to teach university classes. As for female scholars, well, they had no place in the system at all; nothing was clearer than that.

And here is the same passage, but translated by the neural machine translation engine that is Google Translate:

After the lost war, many German-National professors, meanwhile the majority in the faculty, saw themselves as their duty to keep the universities from the “odd”; Young scientists were most vulnerable before their habilitation. And scientists did not question anyway; There were few of them.

Hofstadter points out numerous errors, some of them inconsequential, others critical mistakes. For starters, the original German word Ungeraden is translated by Hofstadter as “undesirables,” in line with his historical, contextual understanding of the term; Google, on the other hand, takes the word for its literal meaning—“un-straight” or “uneven”—given that, in its database, the word was almost always translated as “odd.” 

The long German word Wissenschaftlerinnen in the last sentence is of particular note. While Hofstadter recognizes the feminizing suffix “-in” and hence translates the word correctly as “female scholars,” Google misses that grammatical marker and renders it as “scientists,” which is not just grammatically incorrect but ends up misrepresenting the entire point of the paragraph. 

And with this translation—can we even call it that?—Hofstadter expresses his distrust of mainstream translation engines. “[The translation] doesn’t mean what the original means—it’s not even in the same ballpark. It just consists of English words haphazardly triggered by the German words. Is that all it takes for a piece of output to deserve the label translation?” The main problem, as Hofstadter points out, is that machines are not yet capable of understanding context, which resides primarily in the human mind—unconsciously. There is a wide-reaching net of interconnected ideas and related concepts that humans employ to synthesize whole, structured pieces of writing, and this is what’s missing from machine translation. 

It’s easy to think, given the sound structures of Google’s neatly packaged translations, that machine translation has reached human parity, but scientists beg to differ. While language processing and machine translation are heralded as the next big thing by the media, the field is still in its infancy; current machine translation models are nowhere near the level the media purport them to be. Here is an infographic from the Economist highlighting the surprisingly brief and recent development of machine translation: 

Image Credits: The Economist, “Finding a Voice

As evidenced by the chart above, the history of language technology is surprisingly short. It has only been half a decade since Google released a more modern neural version of Google Translate (only for eight languages then). Hofstadter believes that people think of machine translation as more capable and developed than it is, due to the ELIZA effect: the tendency to assume humanness and conscience in artificial intelligence. The more fluid and verbal a translation is, the more people believe it to be human-like in its accuracy; the content and ideas associated with the original text, however, are all but distorted and destroyed. “The [machine translation] engine isn’t reading anything,” Hofstadter clarifies, “not in the normal human sense of the verb ‘to read.’ It’s processing text.” 

The challenge now is not to perfect an already capable machine, but rather, to lead current research into new directions that will bridge the gap between what people deem machine translation to be, and what it actually is. In 2021, researchers at the University of California, Berkeley, Black in AI, and Google huddled together to examine the possible improvements to machine translation, and in doing so, point out a few directions in which machine translation could improve. 

Given the near impossibility of a successful, completely AI-powered translation model (at least, in the near future), the researchers posit that some level of human supervision is necessary for quality machine translation. Until machine translation is imbued with some sense of agency and conscious contextualization, human-centered machine translation is the best choice for users who need to have a document or writing translated into a language they do not know, and vice versa. 

In their 2021 paper, “Three Directions for the Design of Human-Centered Machine Translation,” the aforementioned researchers introduce innovative directions for further improvement in machine translation. The first is to implement methods that help users craft good inputs. The best way to overcome MT shortcomings is to render the source text as compatible as possible, making translation easier for engines. Quantitative and qualitative evidence has revealed that “MT models perform best on simple, succinct, unambiguous text,” and having the machine first determine whether an input text is easy to translate, and offer advice on how to better phrase it, would help tremendously with the quality of output translations. Of course, there are questions as to how such a system would be programmed and implemented. 
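One way such a pre-check could look, in a deliberately simple form: flag sentences that are long or heavily clause-laden before they are sent to the engine. The heuristics and thresholds below are illustrative assumptions on our part, not something specified in the paper; a production system would use trained quality-estimation models rather than word counts.

```python
import re

def translatability_hints(text, max_words=20, max_commas=3):
    """Return advice strings for sentences likely to be hard for MT.

    Hypothetical heuristic: long or comma-heavy sentences tend to be
    harder to translate, so suggest splitting or simplifying them.
    """
    hints = []
    for sentence in re.split(r"[.!?]+\s*", text):
        if not sentence:
            continue
        words = sentence.split()
        if len(words) > max_words:
            hints.append(f"Long sentence ({len(words)} words): consider splitting it.")
        if sentence.count(",") >= max_commas:
            hints.append("Many clauses in one sentence: consider simpler phrasing.")
    return hints

# A short, plain sentence triggers no advice; a rambling one does.
assert translatability_hints("The meeting starts at noon.") == []
```

The point is not the specific rules but the interaction pattern: the system coaches the user toward “simple, succinct, unambiguous text” before translation ever happens.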

Another direction is to have MT systems “identify errors and initiate repairs without assuming proficiency in the target language.” Given access to back-translation or a bilingual dictionary, users will be able to partake in the translation process, editing the final translation to better fit their original sentiments and intentions. Finally, users will have a much better time with a machine that adjusts its level of formality and literal translation according to the user’s needs. For example, someone might “prioritize accurate translation of domain-specific terms over fluency when using MT at a doctor’s office.” 
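The back-translation idea above can be sketched as a round-trip check: translate the output back into the source language and compare it with the original, so a user who cannot read the target language still gets a rough faithfulness signal. Everything here is a stand-in: `translate` mocks a hypothetical MT API with a tiny dictionary so the example runs, and the word-overlap score is a crude placeholder for real similarity metrics.

```python
# Mocked bilingual "MT engine" for illustration only.
MOCK_MT = {
    ("en", "de"): {"the doctor is here": "der arzt ist hier"},
    ("de", "en"): {"der arzt ist hier": "the doctor is here"},
}

def translate(text, src, tgt):
    """Hypothetical MT call, mocked with a lookup table."""
    return MOCK_MT[(src, tgt)].get(text.lower(), "")

def word_overlap(a, b):
    """Crude similarity: fraction of shared words (Jaccard over word sets)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def round_trip_check(text, src="en", tgt="de", threshold=0.5):
    """True if the back-translation stays close to the original input."""
    forward = translate(text, src, tgt)
    back = translate(forward, tgt, src)
    return word_overlap(text, back) >= threshold

assert round_trip_check("The doctor is here")
```

A low round-trip score would prompt the user to rephrase or to consult a bilingual dictionary, which is exactly the kind of repair loop the authors describe for users with no target-language proficiency.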

While none of these directions are particularly groundbreaking or ingenious, they are realistic, human-centered factors to be considered as developers and researchers design new models. These directions prove to be promising improvements, hopefully helping users make do with the current machine translation available to the public. We hope to see such changes implemented in translation machines in the near future. 

However, there does remain some skepticism; for example, how much freedom do translators have with source texts? It’s not very feasible to assume that translators—or even the writers themselves—can easily alter the phrasing and word choice of the original document. Furthermore, there is always a limitation of knowledge; the average person is usually not privy to the inner workings of machine translation. These human-centered improvements to machine translation are predicated on human ability and capacity, although these aren’t always guaranteed. The authors sum this problem up with the question: “What affordances might help users to make informed judgments about when to rely on a machine translation and when to seek alternatives?”

The authors of the article are hopeful, however, as they apply their findings to their project TranslatorBot. With the implementation of interactive translation suggestions and messages, TranslatorBot drastically improves the user interface; the researchers hope that this, too, will lead to improvement in translation quality. 

An example of an MT system mediating communication by providing extra support for users, for example, by suggesting simpler input text. Image Credits: Robertson et al., “Three Directions for the Design of Human-Centered Machine Translation

To sum up, there are three directions for machine translation through which they can offer more reliable translations for the average user:

  1. helping users craft better inputs
  2. helping users improve translated outputs, and
  3. expanding interactivity and adaptivity.  

It’s important to note that these directions are targeted not at professional translators but at people who use MT on their own terms. The three directions pertain to a translation paradigm in which the original writer is at once the translator and the editor: under these new suggestions, the writer crafts better inputs, improves translated outputs, and receives more nuanced translations. 

What does this mean for professional translators? As intermediary agents, we’re always on the lookout for developments in AI that might put us out of a job, and these three directions sound suspiciously like something that will effectively take our role away from us. But for some reason, it’s hard to think of them as such. 

What the three directions really aim to do is cultivate a writing and translation environment—a certain style and format of writing—that works better with translation engines. This means increased literacy in the art and function of translation, as well as more intimacy between humans and machine translation engines. A more integrated human-MT paradigm, if you will. 

Insofar as translators are concerned, we will be here until the day machine translation achieves complete human parity. Even with the best inputs from the writer, translation issues are bound to arise with modern translation engines, and that means we translators will stick around, offering the best translations and editing our specialized knowledge allows.

Until then, Sprok DTS is here to help you with your translation needs. We offer top-quality translations in 72 languages, fully utilizing the powers of machine translation and computer-assisted translation to ensure the most accurate, most nuanced translations available. With our top-notch private data handling policy, we make sure your translations are safe and usable for your everyday business needs. Ask for a quote today and start your journey with Sprok DTS. 





An Insider’s Guide to the History of Gender Bias in Google Translate

If you’ve ever used a translation engine online, you’ve most likely come across an error in translation. Some errors are small: grammatical mistakes, errors in word choice, etc. Some errors are graver: accidental swear words, culturally insensitive mistranslations, and so forth. But no problem has sparked as much outrage and inspired as much innovation as gender bias in machine translation.

Google Translate caused quite a stir on Twitter around this time last year, when history professor Dora Vargha posted this screenshot:

Image Credits: Dora Vargha

The input is a series of sentences in Hungarian—a gender-neutral language—translated via Google into English. What’s striking is the blatant yet familiar gender bias, the way Google attributes certain words and phrases (beautiful, washes the dishes, sews, cooks, etc.) to feminine pronouns, and others (clever, reads, teaches, makes a lot of money, etc.) to masculine pronouns. 

The picture, now retweeted 12.9K times, has drawn considerable attention and debate, and it marks a critical problem in machine translation: the transfer of human bias into machine translation language models. Soon after the photo went viral, Google made considerable improvements to its interface to allow for nuanced translations of gendered sentences—but Google has been struggling with this issue for quite a while now. 

On December 6, 2018, Google Translate product manager James Kuczmarski admitted in a public announcement that Google Translate “inadvertently replicated gender biases that already existed” and promised its users “both a feminine and masculine translation for a single word.” But the initial update only provided multiple translations for four major languages (French, Italian, Portuguese, and Spanish). The process of eliminating gender bias from the model is still an ongoing battle, as evidenced by Vargha’s Hungarian translation, more than two years after Kuczmarski’s announcement. 


The actual process of eliminating gender bias is quite simple, senior software engineer Melvin Johnson explains in a follow-up article, “Providing Gender-Specific Translations in Google Translate.” There are three steps: detect gender-neutral queries, generate gender-specific translations, and check for accuracy. Johnson’s team uses Turkish as an example of a morphologically complex language, one for which simple gender-neutral pronoun lists don’t suffice and which thus requires a machine-learned system. 

Hence, the first step is to determine whether an input query is masculine, feminine, or gender-neutral in nature. For this, Johnson’s team used “state-of-the-art text classification algorithms” and trained the model on “thousands of human-rated Turkish examples.” The result of the first step is a “convolutional neural network that can accurately detect queries which require gender-specific translations.”

Once a query is classified into one of the three gender categories, the next step is to generate the corresponding output. Johnson’s team improved their “underlying Neural Machine Translation (NMT) system” so that it produces gendered translations when requested and default translations otherwise. For a gender-neutral query, the system adds a gender prefix to the translation request, steering the NMT model toward the requested gendered translation. 

The final step of Johnson’s update is to check for accuracy. Johnson sums up the process like so:

Putting it all together, input sentences first go through the classifier, which detects whether they’re eligible for gender-specific translations. If the classifier says “yes”, we send three requests to our enhanced NMT model—a feminine request, a masculine request and an ungendered request. Our final step takes into account all three responses and decides whether to display gender-specific translations or a single default translation.
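The pipeline Johnson describes can be sketched in a few lines of Python. Everything here is an illustrative stand-in: the classifier, the `<2F>`/`<2M>` gender prefixes, and the translation function are hypothetical placeholders for Google’s trained models, not their actual APIs.

```python
# Hypothetical sketch of the classify-then-request pipeline described above.
# The prefix tokens and function internals are invented for illustration.

GENDER_PREFIXES = {"feminine": "<2F> ", "masculine": "<2M> ", "default": ""}

def needs_gender_specific_translation(query: str) -> bool:
    """Stand-in for the trained classifier that flags gender-neutral queries."""
    # A real system would use a learned text classifier; here we fake it
    # with a check for the Turkish gender-neutral pronoun "o".
    return query.lower().startswith("o ")

def translate(query: str, gender: str = "default") -> str:
    """Stand-in for the enhanced NMT model, steered by a gender prefix."""
    return f"translated({GENDER_PREFIXES[gender]}{query})"

def translate_with_gender_handling(query: str) -> dict:
    """Classify the query, then issue one or three translation requests."""
    if not needs_gender_specific_translation(query):
        return {"default": translate(query)}
    # Three requests: feminine, masculine, and ungendered, as Johnson describes.
    candidates = {g: translate(query, g)
                  for g in ("feminine", "masculine", "default")}
    # A final accuracy check would decide whether to surface both gendered
    # translations or fall back to the single default.
    return candidates

print(translate_with_gender_handling("o bir doktor"))
```

The key design point is that gender steering happens at request time, via a prefix on the input, rather than through separate gendered models.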

Johnson notes that this is only the beginning of addressing gender bias in machine translation systems. Google has a long way to go, especially for genderless languages such as Hungarian, Malay, Finnish, and Swahili.


A year and a half later, Johnson returned to report that the NMT approach had issues with scaling; the system resulted in “low recall, failing to show gender-specific translations for up to 40% of eligible queries.” To solve this, Johnson presents an entirely new paradigm for bias elimination: rewriting-based gender-specific translation, in which the system first generates a single default translation and then rewrites it into the alternate gender where applicable.
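The rewriting idea can be sketched as follows. The rewrite rules and function names here are invented for illustration; the real system uses a trained sequence-to-sequence rewriter rather than a word list.

```python
# Hypothetical sketch of rewriting-based gender-specific translation:
# translate once, then rewrite the gendered words in the output.

REWRITE_RULES = {"he": "she", "him": "her", "his": "her"}

def translate(query: str) -> str:
    """Stand-in for the base NMT model producing one default translation."""
    return "he is a doctor"  # pretend default output for a neutral query

def rewrite_gender(translation: str) -> str:
    """Rewrite the default translation into the alternate gender."""
    words = [REWRITE_RULES.get(w, w) for w in translation.split()]
    return " ".join(words)

default = translate("o bir doktor")
alternate = rewrite_gender(default)
print(default)    # "he is a doctor"
print(alternate)  # "she is a doctor"
```

Because only one translation request is made and the second variant is derived from it, this approach sidesteps the recall problem of issuing multiple gendered requests.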


Fast forward a year. In mid-2021, Google Translate product manager Romina Stella introduced yet another development in bias elimination: contextual gender-bias elimination. Stella uses Wikipedia biographies to develop a dataset of English-to-Spanish translations that utilize context to better identify the correct gender of subjects. The results are shown below:

Image Credits: Romina Stella, “A Dataset for Studying Gender Bias in Translation.” Above: Translation result with the previous NMT model. Below: Translation result with the new contextual model.

There is already a noticeable difference in the accuracy of gender classification. While Stella admits that this dataset “doesn’t aim to cover the whole problem,” the development is noteworthy in its aim “to foster progress on this challenge across the global research community.”

Coming back to Vargha’s experiment on Twitter, it’s uncanny to see how far machine translation has come in the past few decades, yet watch it stumble on a linguistic concept as essential as gendered pronouns. Much of this can be attributed to how systematically biased languages are in their daily use, and to how male-dominated the machine translation field is. The issue of gender bias also goes to show how complex and counterintuitive language is—so much so that we have yet to perfect a widely used system that can classify queries into just three gender categories (masculine, feminine, gender-neutral). 

But it’s people like Vargha, Stella, and Johnson who shed light on these shortcomings of machine translation, pointing us in the right direction. Until machine translation gets there, human translators do the noble work of correctly identifying gendered subjects and providing nuanced, unbiased translations, steering language away from its sexist modes. 

What has your experience been like with Google Translate and other frontend translation machines? Have you spotted any instances of gender bias—or other kinds of bias—in the translations you were working with? How has it impacted your work and your thoughts about machine translation?

Here at Sprok DTS, our translators and localization experts are attentive to the ways in which machine translation reproduces bias; we make sure our solutions are bias-free and inclusive. If you’re looking for a translation and localization service, give Sprok DTS a try: our team ensures that your translations are of the utmost integrity and accuracy.