Spoiler: by generating translations with AI
That's right, at Slite, we have begun localizing our client applications. We started with French, as it's the native language of many Sliters and a significant portion of our customers. The process involves making the app "localization-ready" by extracting all messages into a translation file and then translating these messages into the target languages.
For the translation phase, we evaluated and began implementing an automated workflow based on the auto-translations provided by a Translation Management System (TMS). However, the automated translations, which came from services like Google Translate or DeepL, were of poor quality: only 30% were acceptable as-is. The rest either read badly or broke the message syntax (more on this below). This meant that we could not rely on this "automation" and had to review every single translation, defeating the purpose of the automated process. On top of that, the developer workflows were cumbersome and rigid.
At this point, our AI specialists suggested trying to generate translations using GPT-4. I was skeptical about its ability to respect the intricate ICU syntax of localization messages.
I expected to have to explain the JSON file format and the ICU syntax of the messages.
However, Florian advised me to "Forget it's AI, just describe your task as if you'd ask a fellow developer."
In disbelief, I simply pasted an extract of the translation file and gave a single instruction: "This is an English translation file, give me the French version." There was no mention of JSON or of the ICU format.
To my surprise, not only were the format and syntax perfectly respected, but about 95% of the translations were good as-is. This high level of quality meant that we could trust the automation and only needed to fix the few remaining cases as we encountered them.
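To make "respecting the syntax" concrete, here is a minimal sketch of the kind of check one could run on each translated message. It is not the tooling we used: the function names are hypothetical, and it only compares top-level ICU argument names and brace balance, ignoring nested arguments and ICU's apostrophe escaping.

```python
from typing import Optional, Set


def top_level_arguments(message: str) -> Optional[Set[str]]:
    """Return the top-level ICU argument names, or None if braces are unbalanced.

    Simplification: nested arguments inside plural/select branches and
    apostrophe-escaped braces are not handled.
    """
    names: Set[str] = set()
    depth = 0
    start = 0
    for i, ch in enumerate(message):
        if ch == "{":
            if depth == 0:
                start = i + 1  # remember where the argument body begins
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return None  # closing brace without a matching opening one
            if depth == 0:
                # "count, plural, ..." -> "count"; "author" -> "author"
                names.add(message[start:i].split(",", 1)[0].strip())
    return names if depth == 0 else None


def syntax_preserved(source: str, translated: str) -> bool:
    """True when both messages are balanced and use the same argument names."""
    src = top_level_arguments(source)
    dst = top_level_arguments(translated)
    return src is not None and src == dst
```

A translation that renames an argument (say `{author}` becoming `{auteur}`) or drops a closing brace fails this check, which is the sort of breakage that forces a manual review.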
Encouraged by this success, I decided to push further by injecting the glossary CSV file we had prepared for the TMS to see how GPT-4 would handle it. Once again, it handled the glossary flawlessly, without any description of the file format.
One issue that arose was the translation of "you". In French, "you" can be rendered as either "vous" (plural or polite) or "tu" (singular or familiar), a choice known as the T-V distinction; addressing someone as "vous" is called "vouvoiement". Without instructions, GPT-4 switched between the two forms inconsistently. To address this, we added the instruction "Please use 'vouvoiement'" to the prompt, and that resolved the issue.
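Put together, the prompt is little more than a few plain sentences followed by the raw files. The sketch below shows one way to assemble it; `build_prompt` and the "Here is our glossary:" wording are assumptions for illustration, and the call that actually sends the prompt to the model is deliberately left out.

```python
import json


def build_prompt(english_messages, glossary_csv="", extra_instructions=()):
    """Assemble a plain-language translation prompt (hypothetical helper)."""
    # The single instruction quoted above, with no mention of JSON or ICU.
    parts = ["This is an English translation file, give me the French version."]
    # Additional guidance, such as the vouvoiement instruction.
    parts.extend(extra_instructions)
    if glossary_csv:
        # Assumed wording: the glossary is pasted as-is, format undescribed.
        parts.append("Here is our glossary:\n" + glossary_csv)
    # The translation file itself goes last, as raw JSON.
    parts.append(json.dumps(english_messages, ensure_ascii=False, indent=2))
    return "\n\n".join(parts)


prompt = build_prompt(
    {"welcome": "Welcome back, {name}!"},
    glossary_csv="en,fr\nworkspace,espace de travail",
    extra_instructions=["Please use 'vouvoiement'"],
)
```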
Following these successful tests, the only remaining work was to package it as an automated step in our toolchain and add features such as translating only new or updated messages, error handling, and the like. This was not a significant challenge, and it was well worth the effort, especially compared to the original scenario.
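As an illustration of the "only new or updated messages" part, here is a minimal sketch. It assumes the toolchain keeps a snapshot of the English file from the previous run, which is one possible design rather than a description of our actual implementation; all names are hypothetical.

```python
def messages_to_translate(current_en, previous_en, existing_fr):
    """Select messages that are new, or whose English text changed since the snapshot."""
    return {
        key: text
        for key, text in current_en.items()
        if key not in existing_fr or previous_en.get(key) != text
    }


def merge_translations(current_en, existing_fr, fresh_fr):
    """Combine fresh translations with existing ones, dropping removed keys."""
    merged = {}
    for key in current_en:
        if key in fresh_fr:
            merged[key] = fresh_fr[key]  # new or updated message
        elif key in existing_fr:
            merged[key] = existing_fr[key]  # unchanged, keep as-is
    return merged
```

Only the output of `messages_to_translate` needs to go into the prompt, which keeps each run small and leaves any manually fixed translation untouched as long as its English source does not change.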
Overall, I can only think of good reasons to go full AI for translations:
I wasn't a big AI believer before this; now I'm certain AI will increasingly be relied upon in the future, not just as a gadget assistant but as a first-class citizen of our dev toolbox.
Arnaud Rinquin is a software engineer at Slite. He has led all kinds of development since before Slite was called Slite. Buy him a drink → get a nice story from the trenches of #dev