i18n: The Future
If you have read this far, then I hope you have found something useful. From here on, I will be giving my opinion on how current innovation might help you, if you are looking to start a refactoring project like this. I will also share what I think you could develop with a system like this in place - but all of this is my own opinion, so take it with a pinch of salt.
Above all else, I would like to leave you with one clear takeaway: If you have an idea for a website or service, and want to build an application - start with your dictionary!
It is a natural starting point for thinking about your service - in terms of what you say to your users. What are your calls to action? How do you inform them of changes to their service?
Note each phrase down and give it a name. If you maintain this list methodically, and never build business logic around the phrase text itself (the token name is fine), then starting translation later will be drastically simpler.
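To make that concrete, here is a minimal sketch of such a dictionary. The token names and the translate() function are illustrative, not from any particular library:

```typescript
// A hypothetical phrase dictionary: token name → default (English) phrase.
const phrases = {
  "settings.tab": "Settings",
  "addNewUser.button": "Add new user",
} as const;

type TokenName = keyof typeof phrases;

// Business logic only ever references the token name, never the phrase text.
function translate(token: TokenName): string {
  // A real implementation would also take a locale and look up the
  // translated phrase; here we simply return the English default.
  return phrases[token];
}

console.log(translate("addNewUser.button")); // "Add new user"
```

Because callers depend only on token names, swapping in a second language later means adding a new column to the dictionary, not touching the call sites.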
Otherwise, you will have to spend a lot of time and money refactoring your application to account for multiple languages. Even setting aside any business logic predicated on constructing or parsing strings that now need to work in another language, the time cost of manually refactoring everything else is enormous.
Robust planning and smart usage of tools will help, but the unavoidable fact of doing a refactor like this is that it will touch your entire application.
AI & other tooling
The AI code-generation tools available in 2024 performed well at aiding this refactor. In particular, you can use Copilot to handle the mechanical typing aspect of the work, given prompts that follow your instructions on the token naming convention.
I discuss the practicalities of this here, but in my opinion this is a task where AI code assistance is quite useful.
Abstract Syntax Tree Parsing
Additionally, a library like ts-morph may prove useful in completing an internationalisation refactor. By parsing your codebase as an abstract syntax tree, you can identify which nodes in the tree are strings. Each such node has a readily accessible identifier (its location in the tree), so you can reliably pair these two facts up to produce a list of every location where a branch terminates in a string - i.e. a phrase which needs translating. Note that this approach may produce some false positives.
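Here is a sketch of the idea using the TypeScript compiler API directly (ts-morph is a wrapper over this same API, so the shape is similar). The file name and sample source are made up for illustration:

```typescript
import ts from "typescript";

// Illustrative source - in practice you would read real files from disk.
const source = `
function banner() {
  const title = "Welcome back";
  console.log("Loading users");
}`;

const sf = ts.createSourceFile("app.ts", source, ts.ScriptTarget.Latest, true);

// Collect every string literal along with its location in the file.
const found: { text: string; line: number }[] = [];

function walk(node: ts.Node): void {
  if (ts.isStringLiteral(node)) {
    const { line } = sf.getLineAndCharacterOfPosition(node.getStart());
    found.push({ text: node.text, line: line + 1 });
  }
  ts.forEachChild(node, walk);
}
walk(sf);

console.log(found);
```

The `(text, location)` pairs printed at the end are exactly the phrase list described above; filtering out false positives (e.g. object keys or CSS class names) is then a manual or rule-based pass.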
This list is effectively the list of all phrases discussed in the first chapter. Additionally, when it comes to refactoring out raw strings and replacing them with calls to translate(), you could leverage a "parse and transform" approach using the AST:
- Transpile your source code into an AST using a custom compiler script
- If a node is a string literal, check for this string literal in our phrase list
- If we find it, replace this string literal node with a new node
- The new node should be a function call which uses this string's token name (from the phrase list) as an argument
- Write this modified code back to file.
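The steps above can be sketched with a transformer, again using the TypeScript compiler API rather than ts-morph itself; the phrase list, sample source, and translate() name are assumptions carried over from earlier:

```typescript
import ts from "typescript";

// Phrase text → token name, as produced by the earlier extraction pass.
const phraseList: Record<string, string> = {
  "Add new user": "addNewUser.button",
};

const sf = ts.createSourceFile(
  "app.ts",
  `const label = "Add new user";`,
  ts.ScriptTarget.Latest,
  true
);

const transformer: ts.TransformerFactory<ts.SourceFile> = (ctx) => (root) => {
  const visit = (node: ts.Node): ts.Node => {
    // If a string literal appears in our phrase list, swap the node for a
    // translate() call taking the token name as its argument.
    if (ts.isStringLiteral(node) && phraseList[node.text]) {
      return ctx.factory.createCallExpression(
        ctx.factory.createIdentifier("translate"),
        undefined,
        [ctx.factory.createStringLiteral(phraseList[node.text])]
      );
    }
    return ts.visitEachChild(node, visit, ctx);
  };
  return ts.visitNode(root, visit) as ts.SourceFile;
};

const result = ts.transform(sf, [transformer]).transformed[0];
const output = ts.createPrinter().printFile(result);
console.log(output); // const label = translate("addNewUser.button");
```

Writing `output` back to disk in place of the original file completes the one-time migration for that file.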
You can think of this method as a one time operation which you apply to your codebase to avoid developers having to manually refactor each file.
Automation via token names
The concept of using token names to power automation is, to me, quite exciting.
For each button, field or action a user can complete in your application, we now have an associated token name. For example, say adding a new user to your organisation can be done like so:
- Click "Settings" tab
- Click on "Users" in sidebar
- Click "Add new user" button
- Fill out form
- Click "Submit" button
For each of these clicks, the user is clicking on some text which has a token name. Therefore we can express this user journey as a sequence of token-names:
- app.home.settings
- settings.user
- addNewUser.button
- user.firstName (and the other fields we need to create a user document)
- addNewUser.submit
We can then express a user journey (or a subset of one) as a set of these tokens, and associate a sequence of tokens with additional actions in our back end. This can form the basis for a custom workflow engine, whereby when a certain sequence of tokens is recorded, some special action is taken.
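A minimal sketch of such a trigger might look as follows; the token names and the action name are hypothetical:

```typescript
// When a known sequence of token names appears as a contiguous run in the
// recorded event log, the associated action fires.
type Token = string;

interface Workflow {
  sequence: Token[];
  action: string;
}

const workflows: Workflow[] = [
  {
    sequence: [
      "app.home.settings",
      "settings.user",
      "addNewUser.button",
      "addNewUser.submit",
    ],
    action: "offerAddUserShortcut",
  },
];

// Return the actions whose sequence occurs contiguously in the log.
function triggeredActions(log: Token[]): string[] {
  const actions: string[] = [];
  for (const wf of workflows) {
    for (let i = 0; i + wf.sequence.length <= log.length; i++) {
      if (wf.sequence.every((t, j) => log[i + j] === t)) {
        actions.push(wf.action);
        break;
      }
    }
  }
  return actions;
}

const log = [
  "app.home.settings",
  "settings.user",
  "addNewUser.button",
  "addNewUser.submit",
];
console.log(triggeredActions(log)); // ["offerAddUserShortcut"]
```

A real engine would likely match per-user sessions and tolerate unrelated tokens interleaved in the sequence, but the principle is the same: the token names recorded for UI interactions double as a machine-readable event stream.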
In this system, you can imagine "squashing" a sequence of commonly undertaken tasks into a single command token. For example the sequence above could be reduced to a single view, where the form is surfaced at the press of a button rather than by navigating manually.
Similarly, we can imagine an event log driven by these token events: each action is uniquely labelled by default, and its name implicitly encodes the location from which it emerged.
Conclusion
If you have read this far, then thank you! I hope you have found it useful or entertaining. I would love to hear your thoughts, if you have feedback. I'm also currently looking for new projects - i18n and otherwise - so if you would like to talk more, you can contact me here: leoc (dot) technology (at) proton (dot) me