Software localization

Software localization is the translation of a program's user interface into other languages, together with its adaptation to the cultural conventions of each target region.

Before software can be localized, it must first be internationalized, that is, written so that it can be translated.

This process is labor intensive. Dedicated tools simplify localization, and the work is often entrusted to specialized companies to reduce costs.

Software localization also covers adaptation to local cultural conventions (date, number and currency formats, for example). These local settings are generally shared by different applications, although some specialized applications require parameters of their own. The Unicode Consortium is working to standardize these locale data through the Common Locale Data Repository (CLDR) project.

Difficulties

The first difficulty of localization is the context of the phrases being translated. In the interface, sentences are sometimes assembled from several fragments depending on the context, and these fragments reach the translator separately and out of context. For example, a sentence like

Unable to run the task x: object y is not an object consistent with z specifications.

may be built from two separate strings: “Unable to run the task x:” and “object y is not an object consistent with z specifications.” In some cases, a different phrase may follow the colon. Furthermore, the names x, y and z are themselves stored separately and may vary. The translator therefore translates sentence fragments without knowing how they will be assembled.

There is also a problem of consistency and formatting: does the combination of fragments make sense in the target language? Should there be a space before the first character or after the last one?
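One common way to reduce this problem is to store each message as a single complete sentence with named placeholders, instead of concatenating fragments. The following is a minimal sketch of that idea; the catalog layout, the key `task_failed`, the placeholder names and the French wording are all assumptions made for illustration.

```python
# Hypothetical message catalog: each entry is one complete sentence with
# named placeholders, so the translator sees the whole context at once.
CATALOG = {
    "en": {
        "task_failed": (
            "Unable to run the task {task}: object {obj} is not "
            "an object consistent with {spec} specifications."
        ),
    },
    "fr": {
        # The translator is free to reorder words and to apply local
        # punctuation rules (here, a space before the colon).
        "task_failed": (
            "Impossible d'exécuter la tâche {task} : l'objet {obj} "
            "n'est pas conforme aux spécifications {spec}."
        ),
    },
}


def render(lang: str, key: str, **values: str) -> str:
    """Look up a whole-sentence template and fill in its named placeholders."""
    template = CATALOG[lang][key]
    return template.format(**values)


if __name__ == "__main__":
    print(render("en", "task_failed", task="x", obj="y", spec="z"))
    print(render("fr", "task_failed", task="x", obj="y", spec="z"))
```

Because the spacing and punctuation live inside the template, the question of a leading or trailing space no longer depends on how the program glues fragments together.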

Most programs have “shortcuts”: the most common commands can be reached quickly, for example through a key combination such as [Ctrl]+letter or [Alt]+letter. This method is not very intuitive, but it makes work easier for advanced users. The letter is usually related to the command name (generally its initial). Should a different letter then be used in the target language in order to keep the mnemonic? This raises the issue of consistency (the chosen letter may already have another meaning) and of what the translator is able to do: does the translation interface allow this change, or must the code be modified?

If the letter cannot be changed, the mnemonic should be explained in the documentation. Furthermore, when the command appears in a menu, the shortcut letter is sometimes emphasized (typically underlined). If the original shortcut letter is kept but is no longer the initial of the translated word, the initial must no longer be emphasized, and the shortcut letter should be emphasized instead if it occurs elsewhere in the word.
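The consistency check on shortcut letters can be partly automated. The sketch below assumes the widespread convention in which an ampersand marks the mnemonic letter in a menu label (“&File”), with “&&” standing for a literal ampersand; the function names and the sample labels are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


def mnemonic_of(label: str) -> str | None:
    """Return the lowercase letter that follows the first unescaped '&'."""
    i = 0
    while i < len(label) - 1:
        if label[i] == "&":
            if label[i + 1] == "&":  # '&&' is an escaped literal ampersand
                i += 2
                continue
            return label[i + 1].lower()
        i += 1
    return None


def find_conflicts(labels: list[str]) -> dict[str, list[str]]:
    """Group the labels of one menu by mnemonic and keep only the clashes."""
    by_letter: dict[str, list[str]] = defaultdict(list)
    for label in labels:
        letter = mnemonic_of(label)
        if letter is not None:
            by_letter[letter].append(label)
    return {letter: group for letter, group in by_letter.items() if len(group) > 1}


if __name__ == "__main__":
    # Hypothetical translated menu in which two entries share the letter F.
    print(find_conflicts(["&Fichier", "&Format", "&Edition"]))
```

A check of this kind only detects clashes within one menu; choosing a replacement letter that remains meaningful is still up to the translator.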

Ideally, the translator should master the target language perfectly, as a native speaker who has also studied it, should be familiar with the technical field in order to know its specific vocabulary, and should know the software well enough to understand the context of each sentence. This raises the question of the resources allocated to translation. In all cases, the translator should be able to communicate with the developers.

In addition, some strings have no meaning of their own but are formatting indications (a line break, a variable, and so on), so the translator must be trained to recognize these specific strings and either leave them untranslated or move them depending on context. The translator should also be wary of strings that play a particular role, for example strings that the software detects in order to trigger a particular action. This is really an upstream programming problem: from the perspective of internationalization, no mechanism should rely on searching for a word or phrase, or at least such strings should not be “hard coded” but placed in variables so that they can be translated appropriately.
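A simple automated check can catch translations in which a formatting indication was translated or dropped by mistake. The sketch below assumes placeholders written as `{name}` and line breaks written as newline characters; both the placeholder syntax and the function names are assumptions, since real projects use various conventions.

```python
import re

# Assumed placeholder syntax: an identifier between curly braces, e.g. {name}.
PLACEHOLDER = re.compile(r"\{[A-Za-z_][A-Za-z0-9_]*\}")


def formatting_tokens(text: str) -> list:
    """Return the sorted list of placeholders and line breaks found in text."""
    tokens = PLACEHOLDER.findall(text)
    tokens += ["<newline>"] * text.count("\n")
    return sorted(tokens)


def translation_preserves_tokens(source: str, translation: str) -> bool:
    """True if both strings carry the same tokens, in any order."""
    return formatting_tokens(source) == formatting_tokens(translation)


if __name__ == "__main__":
    source = "File {name} saved in {folder}.\n"
    good = "Dans {folder}, le fichier {name} a été enregistré.\n"
    bad = "Fichier {nom} enregistré dans {folder}.\n"  # placeholder translated
    print(translation_preserves_tokens(source, good))
    print(translation_preserves_tokens(source, bad))
```

Comparing sorted token lists lets the translator move a placeholder within the sentence while still flagging one that was renamed or removed.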

Macro languages are a particular case. Can a macro written with the software in one language be used with the software in another language? This is essentially a question of how instructions are represented: if the instructions are translated, is the word stored in the target language, or is it stored as a code or as a word in the source language and translated only when it is displayed? This is the case, for example, for the names of functions in Microsoft Excel.
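The second option, storing a canonical name and translating it only for display, can be sketched as follows. This is an illustration of the principle only, not a description of how any particular spreadsheet works internally; the table contents, the function names and the regular expression are assumptions, and the sketch renames function names without touching arguments or separators.

```python
import re

# Hypothetical table of canonical (stored) names and their localized forms.
CANONICAL_TO_LOCAL = {
    "fr": {"SUM": "SOMME", "AVERAGE": "MOYENNE"},
}

# A function call is assumed to be a run of letters followed by '('.
FUNC_CALL = re.compile(r"([A-Za-z]+)\(")


def _rename(formula: str, table: dict) -> str:
    """Replace every function name found in table; leave the others as is."""
    def swap(match):
        name = match.group(1)
        return table.get(name.upper(), name) + "("
    return FUNC_CALL.sub(swap, formula)


def to_display(formula: str, lang: str) -> str:
    """Translate a stored (canonical) formula for display in lang."""
    return _rename(formula, CANONICAL_TO_LOCAL.get(lang, {}))


def to_canonical(formula: str, lang: str) -> str:
    """Translate a formula typed in lang back to its stored form."""
    reverse = {local: canon for canon, local in CANONICAL_TO_LOCAL.get(lang, {}).items()}
    return _rename(formula, reverse)


if __name__ == "__main__":
    stored = "=AVERAGE(SUM(A1:A2),B1)"
    shown = to_display(stored, "fr")
    print(shown)
    print(to_canonical(shown, "fr") == stored)
```

With this design, a macro saved by a French-language installation and opened by an English-language one contains the same stored text, so it remains usable in both.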

Then there is the interface problem: messages do not have the same length in different languages, so the translator may have to adjust the size of text boxes and buttons.

Finally, some languages are written from left to right, others from right to left, and others from top to bottom. In this context, where should the text that accompanies a control (button, checkbox, slider, input box, and so on) be positioned?

Writing specifications

Localization is easier when it is taken into account from the design and specification stage of the software, and when development standards are followed. For example:

do not hard-code natural-language words but always use variables (for example, for a string that would be searched for in a text to trigger an action, typically a word like “none”);
design the GUI with generous space reserved for text;
for a title attached to an object, put the title above the object, which removes the need to move or resize the object according to the length of the title.
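The first recommendation above can be sketched as follows: the program tests an internal code that is never translated, and the visible word is looked up separately. The constant `NO_SELECTION`, the `LABELS` table and the function names are hypothetical.

```python
# Internal code used by the program logic; it is never shown to the user
# and therefore never translated.
NO_SELECTION = "none"

# Hypothetical table of displayed labels, indexed by language then by code.
LABELS = {
    "en": {"none": "None", "all": "All"},
    "fr": {"none": "Aucun", "all": "Tous"},
}


def display_label(code: str, lang: str) -> str:
    """Return the text shown to the user for an internal code."""
    return LABELS[lang][code]


def is_empty_selection(code: str) -> bool:
    """Test the internal code, never the translated text."""
    return code == NO_SELECTION


if __name__ == "__main__":
    choice = NO_SELECTION
    print(display_label(choice, "fr"))
    print(is_empty_selection(choice))
```

Had the program compared the displayed text with the word “none”, the test would have stopped working as soon as the label was translated.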

Specialized tools

On Linux, the PO format is used to build multilingual programs. A PO file is a text file containing the original version of the program's messages together with their translated equivalents.

Lingobit Localizer, a localization tool for .NET, MFC and Java;
Gettext, used by most free software on Linux;
Codito (Delphi projects);
Multilizer (Delphi projects);
Visual Localize (TIA) for Visual Studio .NET and Microsoft Windows projects;
Loc@le™ from Accent™ Software for Microsoft Windows (apparently distributed);
Qt Linguist, a tool that facilitates the translation of programs built with the Qt library;
Lokalize, the KDE localization tool;
Po4a, for translating material other than code (documentation, etc.).