So I wrote a helper tool to fix all of this. Now, adding a new language takes just 40 minutes and $2. It worked so well that I cleaned up the project and released it as open source.
# Key feature:
Translation into a new language draws on two source languages at once: the primary (Russian, in our case) and the secondary (English, for us). The secondary language isn’t strictly required, but it is highly recommended. No matter how many other languages you have, only the primary and secondary texts are sent to the LLM as context for translation.
By the way, the context also includes nearby strings and a glossary (more on that below), and the prompt is designed so that the LLM first describes what the string is and where it’s used, and only then translates it. In my tests, this combination dramatically improved translation quality.
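To make the idea concrete, here is a rough sketch of how such a prompt could be assembled. It is not the tool's actual code: the function name `build_prompt`, its parameters, and the exact wording of the instructions are all hypothetical and only illustrate the "describe first, translate second" ordering.

```python
def build_prompt(key, primary, secondary, neighbors, glossary, target_language):
    """Assemble a translation prompt (hypothetical layout, not the tool's real one).

    neighbors: list of (key, text) pairs for strings located near this one.
    glossary:  dict mapping a term to its agreed translation.
    """
    lines = [
        f"Target language: {target_language}",
        f"String key: {key}",
        f"Primary text: {primary}",
    ]
    # The secondary language is optional, so it is only added when present.
    if secondary is not None:
        lines.append(f"Secondary text: {secondary}")
    if neighbors:
        lines.append("Nearby strings:")
        lines.extend(f"- {k}: {v}" for k, v in neighbors)
    if glossary:
        lines.append("Glossary:")
        lines.extend(f"- {term} => {tr}" for term, tr in glossary.items())
    # Ask for the explanation before the translation.
    lines.append("First, briefly describe what this string is and where it is likely used.")
    lines.append("Then give the translation.")
    return "\n".join(lines)


prompt = build_prompt(
    key="cart.checkout",
    primary="Оформить заказ",
    secondary="Checkout",
    neighbors=[("cart.title", "Cart")],
    glossary={"cart": "Warenkorb"},
    target_language="German",
)
```

Asking for the description first gives the model a chance to pin down the meaning of a short, ambiguous string before it commits to a translation.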
# About translations:
- Supported formats: for now, only JSON (flat & structured) + i18next-style pluralization, but it’s easy to add new formats.
- Pluralization: supports both cardinal and ordinal forms. Example:
{ "key_one": "1 file", "key_other": "{{count}} files" }
- Placeholders: ${likeJs}, {{doubleCurve}}, {singleCurve} — you can add new formats easily. Preferred format is set per project.
- Order of strings is preserved! This matters both for meaning and for the context the LLM sees.
- Multiline strings: supports both \r and \n (configurable).
- String comments: you can add explanations, which are stored only in the app. By default, they’re generated by the LLM.
- Suggested translation: you can provide a recommended translation separately (e.g., from a professional translator or AI Suggest).
- Bulk or single translation, with LLM selection per language.
- Reuse of translations: for bulk translation, already translated identical strings are reused.
- Old strings/translations aren’t deleted but kept in the DB. This partly covers branching scenarios in git, where some branches already have new translations and others don’t. Nothing gets lost.
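As an illustration of the structured JSON and pluralization points above, here is a minimal sketch of how a nested file could be flattened while keeping string order, and how i18next-style plural suffixes could be recognized. The helper names `flatten` and `split_plural_key` are hypothetical, and the dot-separated key path is an assumption, not necessarily how the tool stores keys.

```python
import json

# CLDR plural categories used as key suffixes in i18next-style JSON.
PLURAL_CATEGORIES = ("zero", "one", "two", "few", "many", "other")


def flatten(node, prefix=""):
    """Flatten nested dicts into dot-separated keys, keeping file order."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def split_plural_key(key):
    """Return (base_key, kind, category); kind is None for non-plural keys."""
    base, sep, category = key.rpartition("_")
    if not sep or category not in PLURAL_CATEGORIES:
        return key, None, None
    if base.endswith("_ordinal"):
        return base[: -len("_ordinal")], "ordinal", category
    return base, "cardinal", category


raw = '{"files": {"count_one": "1 file", "count_other": "{{count}} files"}, "title": "Home"}'
flat = flatten(json.loads(raw))
```

Note that a purely suffix-based check like this would also treat an ordinary key such as number_one as a plural form, so a real implementation needs some extra rule to tell them apart.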
# String validation
When we started seriously dealing with translation and localization, we quickly realized our translation files were a total mess. It wasn’t just untranslated strings. There were obsolete translations (strings already deleted from the primary language), places where placeholders had been replaced with string concatenation, translations where the primary (Russian) used “:” but the secondary (English) didn’t, and line breaks that existed in only one of the two. There were even cases where the primary had a placeholder but the secondary was missing it.
All these cases are now checked, and any uploaded/translated string gets a Warning flag if:
- The translation string is empty
- There are leading or trailing spaces
- The string contains multiple consecutive spaces
- The translation is identical to the primary or secondary (with exceptions for email, api, ip, url, uri, id)
- A placeholder is missing that exists in the primary language
- The translation has a placeholder that doesn’t exist in the primary
- Number of line breaks (\r or \n) differs between primary and translation
- Number of colons : differs between primary and translation
- A plural form is missing or extra for the language
- Plural forms of the same string differ in line breaks or colons
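A minimal sketch of what most of these checks could look like for a single non-plural string is below. It is not the tool's implementation: the function name `validate`, the warning codes, and the placeholder regex (which matches all three placeholder styles at once instead of one per-project format) are assumptions made for illustration, and the plural checks are left out.

```python
import re

# Matches ${likeJs}, {{doubleCurve}} and {singleCurve} placeholders.
PLACEHOLDER_RE = re.compile(r"\$\{[^{}]+\}|\{\{[^{}]+\}\}|\{[^{}]+\}")

# Short technical terms that are allowed to stay identical to the source.
IDENTICAL_EXCEPTIONS = {"email", "api", "ip", "url", "uri", "id"}


def count_line_breaks(text):
    return text.count("\r") + text.count("\n")


def validate(primary, translation, secondary=None):
    """Return a list of warning codes for one translated string."""
    if translation.strip() == "":
        return ["empty"]
    warnings = []
    if translation != translation.strip():
        warnings.append("edge_spaces")
    if "  " in translation:
        warnings.append("double_spaces")
    sources = [s for s in (primary, secondary) if s is not None]
    if translation in sources and translation.strip().lower() not in IDENTICAL_EXCEPTIONS:
        warnings.append("identical")
    expected = set(PLACEHOLDER_RE.findall(primary))
    actual = set(PLACEHOLDER_RE.findall(translation))
    if expected - actual:
        warnings.append("placeholder_missing")
    if actual - expected:
        warnings.append("placeholder_extra")
    if count_line_breaks(primary) != count_line_breaks(translation):
        warnings.append("line_breaks")
    if primary.count(":") != translation.count(":"):
        warnings.append("colons")
    return warnings


clean = validate("Файлы: {{count}}", "Files: {{count}}")
broken = validate("Файлы: {{count}}", " Files {count}")
```

In this sketch every check only produces a warning code and never blocks the string, which matches the idea that warnings are flags for a human to review.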
Regardless of the validation result, the user can manually mark a string as verified, which allows flexible filtering and control over bulk translation.
GIFs and more info on GitHub: https://github.com/XAKEPEHOK/lokilizer/