As we move closer to shipping Ubiquity 0.5, there is still much work to be done, particularly in the area of localization. In a recent Ubiquity meeting we laid out the explicit localization goals and non-goals of 0.5 as follows:
- Goals for 0.5
  - Parser 2 (on by default)
  - underlying support for localization of commands
  - localization of standard feed commands for a few languages
  - Parser 2 language files for those same languages
- Non-goals for 0.5
  - distribution/sharing of localizations
  - localization of nountypes
The overall goal for this release of Ubiquity is to come up with a format and standard for localization. Localizations in Ubiquity 0.5 will only apply to commands bundled with Ubiquity, and the localization files themselves will be distributed with Ubiquity. In a future release we will tackle the problem of localizations for commands in the wild and truly crowd-source[1] this process.
The localization of Ubiquity commands will use a gettext-style approach, where localization files list key-value pairs for the different properties and messages of the commands. For Ubiquity 0.5, where we only deal with the standard command feeds bundled with Ubiquity, we can simply place all the localization files in ubiquity/standard-feeds/localization. Localization files are organized by source feed, with one localization file per source feed, per language.
The localizable components of commands will include properties such as names and help, as well as any localizable strings in the command’s preview() and execute() methods. To make strings localizable in preview() and execute(), they must be wrapped in the localize function, _() [2]. Other localizable components, like help, will not need to be wrapped in the _() function. In addition, as the localization files can only hold string values, properties with multiple values, such as names and contributors, can use | as a delimiter.
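To make the wrapping and the | delimiter concrete, here is a minimal sketch of what a localize function could look like. It is not the Ubiquity implementation: the lookup table, the French strings, the choice to key entries by source text, and the helper name splitValues are all assumptions made for illustration. The %S replacement follows the example given in the footnote.

```javascript
// Hypothetical in-memory localization table. In Ubiquity these pairs would
// come from a localization file for the feed; keying by source text is an
// assumption made to keep the sketch short.
const localizedStrings = {
  'Hello, world!': 'Bonjour, le monde !',
  'This is a %S.': 'Ceci est un %S.',
};

// Sketch of a localize function: look up the string, fall back to the
// original when no localization exists, then fill %S placeholders in order.
function _(text, replacements = []) {
  const hasEntry = Object.prototype.hasOwnProperty.call(localizedStrings, text);
  const template = hasEntry ? localizedStrings[text] : text;
  let index = 0;
  return template.replace(/%S/g, (match) =>
    index < replacements.length ? String(replacements[index++]) : match);
}

// Multi-valued properties (for example names) are stored as a single
// "|"-delimited string and split back into a list when read.
function splitValues(value) {
  return value
    .split('|')
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

console.log(_('Hello, world!'));
console.log(_('This is a %S.', ['test']));
console.log(splitValues('bonjour|salut'));
```

Falling back to the original string means a command stays usable when a localization is missing or incomplete, which matters while localization files are still being written.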
The Localization Experience
One tool we have planned to help kickstart the localization process is a tool that will automatically create a template of strings that need localization in a user’s commands. I took a first stab at this tool today. Clicking on the “get localization template” link next to each feed in the Ubiquity command list will give you a template which you can then copy into a text file:
Additionally, instructions will later be added to this page to specify how and where to save localizations in order to test them, or perhaps we can add a button that will automatically save them in the right location.
Localization file formats
There are two kinds of file formats for localizations we are considering: .properties and .po, the native gettext format. As an example, here is the same key-value pair in the two formats.

In .properties:

    # This is a comment
    welcomeMessage=Hello, world!

In .po:

    #. This is a comment (the . is actually optional)
    msgid "welcomeMessage"
    msgstr "Hello, world!"
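To show that the two formats carry the same information, here is a rough sketch of two tiny parsers that read the example pair above into the same JavaScript object. The function names parseProperties and parsePo are hypothetical, and both parsers are deliberately incomplete: they ignore escapes, line continuations, multi-line strings, plurals and contexts, all of which a real library would need to handle.

```javascript
// Parse simple "key=value" lines. Lines starting with # or ! are comments.
function parseProperties(source) {
  const entries = {};
  for (const rawLine of source.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#') || line.startsWith('!')) continue;
    const separator = line.indexOf('=');
    if (separator === -1) continue; // not a key-value line
    entries[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }
  return entries;
}

// Parse single-line msgid/msgstr pairs. Comment lines and anything else
// that does not match are skipped.
function parsePo(source) {
  const entries = {};
  let currentId = null;
  for (const rawLine of source.split('\n')) {
    const match = /^(msgid|msgstr)\s+"(.*)"$/.exec(rawLine.trim());
    if (!match) continue;
    if (match[1] === 'msgid') {
      currentId = match[2];
    } else if (currentId !== null) {
      entries[currentId] = match[2];
      currentId = null;
    }
  }
  return entries;
}

const propertiesText = '# This is a comment\nwelcomeMessage=Hello, world!';
const poText = '#. This is a comment\nmsgid "welcomeMessage"\nmsgstr "Hello, world!"';

console.log(parseProperties(propertiesText));
console.log(parsePo(poText));
```

Either way the command code only ever sees a key-to-string map, so the choice between the formats is mostly about tooling and about which characters are allowed in keys.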
The advantage of .po is that it is the de facto standard in localization, particularly in the UNIX world. Lots of great tools have been written for it. The adoption of .po could make Ubiquity localization more accessible for more people. Another advantage is that .po files can have keys with spaces, as I note below.
If we do opt to work with .po files, the two libraries I see out in the wild for dealing with .po files are gettext-js (MIT) and jsgettext (LGPL). While I haven’t looked at the libraries in depth yet, so far jsgettext seems to be the winner, as some sections of gettext-js require the use of the prototype.js library.
A “key” question
In either file format, we need a unique way to refer to each localizable string: a key format. As each localization file refers to a command feed, the first collision we must avoid is between commands, so keys begin with the command name. With this in mind, we can come up with some trivial keys for the localizable properties (here, consider a command named hello).
However, we run into difficulty when we try to come up with keys for the arbitrary text in preview() and execute(). For example, for a message like “Hello world!” in the preview, we could simply make the key hello.preview.Hello world! but this may be unruly and prone to typos. In addition, keys in .properties files cannot contain certain characters, like spaces, so we would have to make the key something like hello.preview.Hello_world! or a variant with symbols stripped and case standardized.
Keys could also get very long with this type of key format, although here again .po files may have an advantage, as they stay relatively legible even with long keys. One option to deal with this would be to optionally supply a key argument to _() so that it is used instead of the automatic key. For example, suppose the hello command’s preview() included this code:
_('This is a really long greeting message. Hello there!','longmessage')
then a localizer would only have to refer to hello.preview.longmessage.
satyr points out that some commands use another function to incorporate similar actions and messages in both preview() and execute(). In this case, he argues, it wouldn’t make sense to have to keep both localizations (hello.preview.… and hello.execute.…). He suggests that optional keys (mentioned above) could be used without the preview. and execute. infixes, as in hello.longmessage. By taking out the preview/execute namespacing in the localization keys, though, it becomes the command author’s responsibility to not accidentally use strings named “names”, “help”, etc. that will have unintended consequences.
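The key options discussed in this section can be sketched as a small helper. This is only an illustration of the trade-offs, not a proposed API: the names normalizeText, makeKey and RESERVED_KEYS, the options object, and the particular normalization (lowercase, with runs of other characters collapsed to underscores) are all assumptions.

```javascript
// Property names that an un-namespaced key must not shadow. Only "names"
// and "help" are named in the post; the full list is left open here.
const RESERVED_KEYS = ['names', 'help'];

// Turn arbitrary message text into a key segment that is safe for
// .properties files. This is one possible normalization among many.
function normalizeText(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_')
    .replace(/^_+|_+$/g, '');
}

// Build a localization key for a string in a command.
//   options.method: 'preview' or 'execute'; omit it for satyr's variant
//                   without the infix.
//   options.key:    author-supplied short key, used instead of the text.
function makeKey(command, text, options = {}) {
  const segment = options.key || normalizeText(text);
  if (!options.method && RESERVED_KEYS.includes(segment)) {
    throw new Error('Key "' + segment + '" collides with a command property');
  }
  const parts = options.method
    ? [command, options.method, segment]
    : [command, segment];
  return parts.join('.');
}

const longText = 'This is a really long greeting message. Hello there!';
console.log(makeKey('hello', 'Hello world!', { method: 'preview' }));
console.log(makeKey('hello', longText, { method: 'preview', key: 'longmessage' }));
console.log(makeKey('hello', longText, { key: 'longmessage' }));
```

The guard against reserved names shows one way the tooling could take over part of the responsibility that otherwise falls on the command author when the infix is dropped.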
I hope that this blog post gives people an idea of the progress we’ve made in the localization area and gets people thinking about the challenges we still face. We’d love to get your feedback on the localization format and process in Ubiquity, as well as the open problems of the file format and keys.
[1] Or “cloud-source”… finally a Japanese accent joke that’s semantically stable!
[2] The _() function currently also has the ability to do simple printf-formatted string replacements:
_('This is a %S.',['test'])
Whether this format will replace support for CmdUtils.renderTemplate remains to be seen and is definitely worthy of discussion. If we move away from properties files, in particular, we may keep renderTemplate() in lieu of the printf format. Mozilla’s built-in stringbundle handling just gave us a fast and free implementation of printf-style replacement.