Chapter 11. Localization

This chapter describes how to use Mozilla's internationalization (I18N) and localization (L10N) technologies to make applications usable by people around the world. Because the Mozilla community (and the Internet community in general) is global, it is vital to be able to cross language barriers by localizing your application and making it available to a wider audience.

In this chapter, you are given step-by-step instructions on how to change the visible text for your application in the XUL interface and how to handle nonstatic strings that arise from dynamic string handling in other areas of your application code.

While the basic technologies that are used are not new, Mozilla is innovating in areas such as Unicode support and quick access language pack installs. The information in this chapter about the internationalization (http://www.mozilla.org/projects/intl/index.html) and localization (http://www.mozilla.org/projects/l10n/mlp.html) projects will give you a solid foundation for what is possible in your own application.

11.1. Localization Basics

Before learning how to localize your Mozilla application, it's useful to run through some of the high-level goals and features of the Mozilla internationalization and localization projects. First, here are some definitions:

Internationalization (I18N)

The design and development of software so that it can function in any particular locale without being re-engineered. The shorthand term, I18N, refers to the 18 letters between the initial “i” and final “n.”

Localization (L10N)

The modification of software to meet the language of a location and the adaptation of resources, such as the user interface (UI) and documentation, for that region. L10N is shorthand for localization and refers to the 10 letters between the initial “l” and final “n.”

Locale

"A set of conventions affected or determined by human language and customs, as defined within a particular geo-political region. These conventions include (but are not necessarily limited to) the written language, formats for dates, numbers and currency, sorting orders, etc.," according to the official Mozilla document found at http://www.mozilla.org/docs/refList/i18n/.

Locale in the context of this chapter is related specifically to the display of text in the user interface. The focus will be on UI localization of XUL files and strings contained in JavaScript and C++ files, as well as the methods employed for localization.

Here are some main features of Mozilla's internationalization capabilities that are relevant at the front-end, user-facing level of an application:

  • Mozilla is Unicode-enabled for Latin-based languages, Cyrillic, Greek, Chinese, Japanese, and Korean. Mozilla widgets and HTML rendering can support the input and display of these languages. Unicode-enabling for other languages and character sets is an ongoing process.
  • Mozilla can be easily localized into different languages, even if not supported by the underlying operating system.
  • Most Mozilla localization work involves translating strings as entities in Document Type Definition (DTD) format and properties file format (an idea taken from Java), which are based on open standards.
  • Localization can be done once and run on Windows, Macintosh, Unix, and other platforms -- something we have come to expect from the Mozilla framework. This is a great time saver, and indeed a cost saver if you come at it from that perspective.
  • Mozilla supports BIDI, the display and input of text in a bidirectional format for such languages as Arabic and Hebrew, yet the capabilities for this in the UI were not mature when we were writing this book.
  • The UI locale DTD files use UTF-8 as the default encoding for translated items. Mozilla then maps to Unicode or non-Unicode fonts, depending on which platform you're running on or what fonts you installed in your system. You are encouraged to encode your DTD files as UTF-8 when possible.

Recalling the architecture of the XPFE toolkit described in Chapter 2, the locale component can be easily plugged in and out of the application that you are working on without impacting any other components. This functionality is ideal, for instance, for people with linguistic skills and less experience with technical issues to become involved in a Mozilla-related project.

11.1.1. For the Developer

Many available resources show you how to help localize an existing application into a specific language or to find out how to add localization support to your own application.

The Mozilla Localization Project hosts various localization teams and provides help whenever possible. The Mozilla community includes a discussion group that uses many languages to discuss Mozilla development issues. The netscape.public.mozilla.l10n and netscape.public.mozilla.i18n newsgroups are a great place to discuss these issues with other developers.

When developing an application, some words and phrases that developers like to hear (according to the Mozilla organization, at http://www.mozilla.org/projects/l10n/xul-l10n.html) are: standards compliant, simple, leveragable, portable, extensible, separable, consistent, dynamic, valid, parser friendly, invisible (part of the XUL authoring process), and efficient. The following sections will help you understand how these terms and goals impact the chosen technologies and how to use those technologies. The ultimate aim is to help you localize your application easily.

11.1.2. Files and File Formats

Here are the main file types you'll see when learning about locale and that you will use when localizing your Mozilla application. A good home for all of these resources is in the locale area of the application chrome.

DTD (.dtd)

Files containing entities that host the strings from XUL content files.

Property (.properties) or string bundles

Files containing strings that are accessed by JavaScript, C++, and possibly other scripting or component files.

RDF

RDF files are written in XML syntax, so they can use entities for their localizable strings.

HTML and text

Suitable for long text, HTML and XML documents and other content that needs to be localized.

The next two sections will help you start localizing your application. The sections focus on DTD files and string bundles, which are the core formats for XUL-localizable content. Before getting started, here is a review of some general principles that might help you design and implement the locale component.

11.1.3. UI Aesthetics and Principles

To put locale in context, this section looks at some issues you may encounter when localizing your Mozilla application. Some are universal principles and others are unique to the environment. This reference is by no means exhaustive, but it contains some scenarios and tips the authors came across in their experience with locale in Mozilla.

11.1.3.1. Space management

One of the guiding principles in UI design is to keep your interface from getting too crowded. Although estimates vary, it is wise to leave about 30 percent expansion space in your windows and dialogs. To achieve this flexibility, you have to ensure that the XUL window has ample space in the first place for all the widgets to fit.

More specifically, the application needs to have space for widgets to expand or contract without detracting from the overall look and feel. Intuitive use of the XUL box model (refer to Chapter 3 for more information) and correct choice of widgets goes a long way in achieving this goal.

The factors that can cause this space to be filled include using languages/character sets that are more verbose than the one that was there originally, and the users changing their font size settings. Some safeguards that have been built into Mozilla already handle this problem. Much of it is done in CSS, but other methods are available. The section “Language Quirks,” later in this chapter, outlines one of these methods.
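The 30 percent guideline can be checked mechanically before a translation ships. The following is a minimal sketch in plain JavaScript and is not part of Mozilla. The function name findOverlongStrings and the sample strings are hypothetical, and character count is only a rough stand-in for rendered width.

```javascript
// Hypothetical helper: report translated strings that grow more than
// `allowance` (0.3 = 30 percent) beyond the length of the original string.
// Character count is only a rough proxy for the rendered width.
function findOverlongStrings(original, translated, allowance) {
  var overlong = [];
  for (var key in original) {
    if (!Object.prototype.hasOwnProperty.call(translated, key)) {
      continue; // untranslated strings are a separate problem
    }
    var limit = original[key].length * (1 + allowance);
    if (translated[key].length > limit) {
      overlong.push(key);
    }
  }
  return overlong;
}

var english = { "menuOpen.label": "Open...", "menuClose.label": "Close" };
var spanish = { "menuOpen.label": "Abrir Archivo...", "menuClose.label": "Cerrar" };

// Names of the strings whose translation exceeds the expansion allowance.
var flagged = findOverlongStrings(english, spanish, 0.3);
```

A check like this does not replace looking at the translated UI, but it can point a reviewer at the widgets most likely to overflow.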

11.1.3.2. Help system

If you choose to integrate a Help system into your application, most of its content will be a localizable resource. Opinions differ within technical writing circles, but having screenshots in your documents is generally not considered advantageous. For example, they can easily get out of date in the constantly evolving world of software, and they need to be retaken frequently when new features are added to the UI.

11.1.3.3. Tooltips

Tooltips are a sometimes overlooked yet valuable way of relaying information to the user. They can be used as an alternative to a help system if you are looking for something simpler. They can also expand an explanation of something that was annotated in the UI text. Sometimes text can have multiple meanings in context, and expanding it with a tooltip can clear up any confusion. In an editor or multifile browser, for example, you might have a find button. A tooltip can make clear whether the action searches the current file or all files.

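Most XUL widgets support tooltips. Implementation is as straightforward as adding a tooltiptext attribute to the widget with the text to display as its value. (The similarly named tooltip attribute is different: it takes the id of a <tooltip> element rather than a string.) For the text to be localizable, it must be in the form of a DTD entity.

<tab id="config" label="&config.label;" tooltiptext="&config.tooltip;" />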

Section 11.2.1, later in this chapter, provides more information on the rationale for using entities and how to insert them into XUL content.

11.1.3.4. Grammar

In any user interface, there is limited screen space. When possible, however, provide complete or near-complete sentences. These sentences are better than using text based on phrases or acronyms. They provide meaning to the translator and clearer instructions to the user.

11.1.3.5. Commenting

Commenting was mentioned before, but is worth stressing again. The translators may never have seen the software that you are working on, although you hope that is not the case. Comments are very useful for giving context and for flagging strings that should not be translated. You can comment your HTML, XML, or DTD files by wrapping the comment text in a <!-- comment --> block.

<!--NOTE to Translators: Do NOT change the next string -->
<!ENTITY appName.label "My Application">

Note that a bundle file uses the # notation at the beginning of each line to signify a comment.

# This text is used in the view menu for launching the page choices dialog
pageChoices=Go To...

11.1.3.6. Web resources

Localizable resources are not only strings of text that need to be translated into different languages; they are any variable information that is liable to change over the lifetime of your application. The handling of URLs is a case in point. You may have references interspersed throughout your UI that point to web resources. These references can be explicit listings or widgets that, once activated, launch a client to bring you to a certain location.

Images are another resource commonly used in documentation. A tutorial on your application may have screenshots of the UI in action. If you do use images, keep an eye out for localizable content in them.

11.2. DTD Entities

Entities in XUL work the same way as they do in any other XML application. They are used to reference data that was abstracted from the content. This process encourages reuse of data, but in the context of Mozilla's XPFE, it is used to extract visible text in interface widgets. This extraction ensures that the content can remain untouched during the localization process.

11.2.1. Inserting Entities

Example 11-1 shows how to put DTD entities into your XUL code by using attribute values for the text of a menu item (label) and the keyboard access shortcuts (accesskey). The syntax requires that an entity be placed in quotes as the value of the attribute. This is a useful example because it highlights the localization of a widget label, which is common to many widgets, and a supplementary attribute, which, in this case, is an accesskey.

Example 11-1. XUL menu with entity references for text and accesskeys
<menu label="&menuFile.label;" accesskey="&menuFile.accesskey;">
  <menupopup>
    <menuitem accesskey="&menuNew.accesskey;" label="&menuNew.label;"
        oncommand="doNew( );"/>
    <menuitem accesskey="&menuOpen.accesskey;" label="&menuOpen.label;"
        oncommand="doOpen( );"/>
    <menuseparator />
    <menuitem accesskey="&menuClose.accesskey;" label="&menuClose.label;"
        oncommand="doClose( );"/>
    <menuitem accesskey="&menuSave.accesskey;" label="&menuSave.label;"
        oncommand="doSave( );"/>
    <menuitem accesskey="&menuSaveAs.accesskey;" label="&menuSaveAs.label;"
        oncommand="doSaveAs( );"/>
    <menuseparator />
    <menuitem accesskey="&menuPrint.accesskey;" label="&menuPrint.label;"
        oncommand="doPrint( );"/>
    <menuseparator />
    <menuitem accesskey="&menuExit.accesskey;" label="&menuExit.label;"
        oncommand="doExit( );"/>
  </menupopup>
</menu>

Each entity in Example 11-1 has a text value associated with it in the DTD entity declarations. The entity that supplies the label of the menu is &menuFile.label;. It follows the general syntax for referencing a value, which is &get.text;.

The entity reference (or name, in this context) must be preceded by an ampersand (&) and end with a semicolon (;). The period is optional, but conventional. Typically, the period separates the entity's element or target (menuFile) from the type of entity (label). Refer to Section 11.4, later in this chapter, for more information on naming conventions.
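Because the reference syntax is so regular, the entity names that a XUL file relies on can be listed with a short script. The following is a rough sketch in plain JavaScript rather than a Mozilla utility; the function name findEntityReferences and the sample markup are hypothetical.

```javascript
// Hypothetical helper: collect the names of all entity references
// (&name;) found in a string of XUL source. The five entities
// predefined by XML are skipped because they need no DTD entry.
function findEntityReferences(xulSource) {
  var predefined = ["amp", "lt", "gt", "quot", "apos"];
  var pattern = /&([A-Za-z_][A-Za-z0-9_.\-]*);/g;
  var names = [];
  var match;
  while ((match = pattern.exec(xulSource)) !== null) {
    var name = match[1];
    // Record each name once, in order of first appearance.
    if (predefined.indexOf(name) === -1 && names.indexOf(name) === -1) {
      names.push(name);
    }
  }
  return names;
}

var sample =
  '<menu label="&menuFile.label;" accesskey="&menuFile.accesskey;">' +
  '<description>Fish &amp; chips: &explanation.text;</description></menu>';

// The three entity names the sample markup depends on.
var referenced = findEntityReferences(sample);
```

The resulting list can be compared against the names declared in the DTD to catch a reference that has no matching declaration.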

For some widgets, including <description> and <label>, the entity can be placed inside the element tags, as opposed to being values of attributes.

<description>&explanation.text;</description>

Table 11-1 shows the DTD files that accompany the XUL content in Example 11-1. Two languages, English and Spanish, are separated into different files. These files have the same name as the DTD file referenced in the XUL file that contains the entities, but the file for each language lives in its own locale folder. Each entry, or entity, in the DTD file has a name that matches the name referenced in the XUL and a value to be filled in for that entity. The value is enclosed in quotes. When generating these files, you need to create the file only once; you can then copy it to the directory for another language and replace the values of the entities there. A good tool would carry out this process for you. Refer to the Localization Tools sidebar later in the chapter for more information.

Table 11-1. Entity definitions for the XUL menu

English DTD

<!ENTITY menuFile.label "File">
<!ENTITY menuNew.label "New">
<!ENTITY menuOpen.label "Open...">
<!ENTITY menuClose.label "Close">
<!ENTITY menuSave.label "Save">
<!ENTITY menuSaveAs.label "Save As...">
<!ENTITY menuPrint.label "Print...">
<!ENTITY menuExit.label "Exit">
<!ENTITY menuFile.accesskey "f">
<!ENTITY menuNew.accesskey "n">
<!ENTITY menuOpen.accesskey "o">
<!ENTITY menuClose.accesskey "c">
<!ENTITY menuSave.accesskey "s">
<!ENTITY menuSaveAs.accesskey "a">
<!ENTITY menuPrint.accesskey "p">
<!ENTITY menuExit.accesskey "x">

Spanish DTD

<!ENTITY menuFile.label "Archivo">
<!ENTITY menuNew.label "Nuevo">
<!ENTITY menuOpen.label "Abrir Archivo...">
<!ENTITY menuClose.label "Cerrar">
<!ENTITY menuSave.label "Salvar">
<!ENTITY menuSaveAs.label "Salvar Como...">
<!ENTITY menuPrint.label "Imprimir...">
<!ENTITY menuExit.label "Salir">
<!ENTITY menuFile.accesskey "a">
<!ENTITY menuNew.accesskey "n">
<!ENTITY menuOpen.accesskey "o">
<!ENTITY menuClose.accesskey "c">
<!ENTITY menuSave.accesskey "s">
<!ENTITY menuSaveAs.accesskey "a">
<!ENTITY menuPrint.accesskey "i">
<!ENTITY menuExit.accesskey "r">

Figure 11-1 shows the resulting XUL menus. There can only be one value for each entity and only one language taking precedence, or appearing in the UI, at a time.

Figure 11-1. Localized menus in English and Spanish

This example presents only two languages, but theoretically, you can have as many languages as you require. The locale-switching mechanism and the chrome registry must determine which one should be used, which is explained later in the section “The Chrome Registry and Locale.”
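Since every locale's DTD file has to declare the same entity names, a translated file can be checked against the original for omissions. The following is a minimal sketch in plain JavaScript, assuming the simple quoted declarations used in locale files. The function names parseEntities and findMissingEntities are hypothetical, and the regular expression is not a full DTD parser.

```javascript
// Hypothetical helper: read <!ENTITY name "value"> declarations from the
// text of a DTD file into a plain object. Comments are removed first so
// that commented-out declarations are not counted. Parameter entities
// (<!ENTITY % name ...>) are deliberately ignored.
function parseEntities(dtdText) {
  var withoutComments = dtdText.replace(/<!--[\s\S]*?-->/g, "");
  var pattern = /<!ENTITY\s+([^\s%"']+)\s+("([^"]*)"|'([^']*)')\s*>/g;
  var entities = {};
  var match;
  while ((match = pattern.exec(withoutComments)) !== null) {
    entities[match[1]] = match[3] !== undefined ? match[3] : match[4];
  }
  return entities;
}

// Names declared in the reference locale but absent from the translation.
function findMissingEntities(referenceDtd, translatedDtd) {
  var reference = parseEntities(referenceDtd);
  var translated = parseEntities(translatedDtd);
  var missing = [];
  for (var name in reference) {
    if (!Object.prototype.hasOwnProperty.call(translated, name)) {
      missing.push(name);
    }
  }
  return missing;
}

var englishDtd = '<!ENTITY menuFile.label "File">\n' +
                 '<!ENTITY menuFile.accesskey "f">\n' +
                 '<!ENTITY menuExit.label "Exit">';
var spanishDtd = '<!-- <!ENTITY menuExit.label "Salir"> -->\n' +
                 '<!ENTITY menuFile.label "Archivo">\n' +
                 '<!ENTITY menuFile.accesskey "a">';

// menuExit.label is only present inside a comment in the Spanish text.
var missing = findMissingEntities(englishDtd, spanishDtd);
```

A missing entity matters more in XUL than a missing string does elsewhere, because an undefined entity reference is an XML error and can prevent the whole file from loading.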

11.2.2. External and Inline Entities

You may ask, how are the entities accessed? You can associate the DTD with your XUL file in two ways. The first is internally, which involves declaring the entities directly in the XUL file, inside the square brackets of the DOCTYPE declaration.

<!DOCTYPE window [
  <!ENTITY windowTitle.label "Greetings">
  <!ENTITY fileMenu.label "File">
]>

The second is to keep the entities in an external DTD file, which is associated with your XUL file by a DOCTYPE declaration that contains a reference to that file:

<!DOCTYPE window SYSTEM "chrome://xfly/locale/xfly.dtd">

The name given in the DOCTYPE declaration usually matches the XUL document's root element. In this case, it is window, but it can be another element such as page or dialog. (The name is not actually validated, however, so it can be any value.)

If you have a small application, the DTD files can reside in the same folder as your XUL files, but putting them into their own locale directory within your chrome structure is good practice.

Consider the main Editor window in Mozilla. Its declaration in Example 11-2 is flexible enough to associate multiple DTD files with your content.

Example 11-2. The Editor's Doctype definitions
<!DOCTYPE window [
  <!ENTITY % editorDTD SYSTEM "chrome://editor/locale/editor.dtd" >
  %editorDTD;
  <!ENTITY % editorOverlayDTD SYSTEM "chrome://editor/locale/editorOverlay.dtd" >
  %editorOverlayDTD;
  <!ENTITY % brandDTD SYSTEM "chrome://global/locale/brand.dtd" >
  %brandDTD;
]>

The declaration works in two steps. It first stores the document associated with the chrome URL in a parameter entity, and then it uses that entity, which pulls the declarations in the file into the document. XML has no one-step way of storing and using an entity, as some other languages do. Taken together, the two steps are the equivalent of import foo in Python or #include "foo.h" in C.
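The declare-then-use pattern is mechanical enough to be generated. As an illustration only, the following plain JavaScript sketch builds such a declaration from a list of DTD files. The function name buildDoctype and the entity name xflyDTD are hypothetical and not taken from the Mozilla source.

```javascript
// Hypothetical helper: build a DOCTYPE declaration that declares and then
// immediately uses one parameter entity per external DTD file.
function buildDoctype(rootName, dtds) {
  var lines = ["<!DOCTYPE " + rootName + " ["];
  for (var i = 0; i < dtds.length; i++) {
    // Step 1: store the external file in a parameter entity.
    lines.push('  <!ENTITY % ' + dtds[i].name +
               ' SYSTEM "' + dtds[i].url + '" >');
    // Step 2: use the entity so the declarations in the file take effect.
    lines.push("  %" + dtds[i].name + ";");
  }
  lines.push("]>");
  return lines.join("\n");
}

var doctype = buildDoctype("window", [
  { name: "xflyDTD", url: "chrome://xfly/locale/xfly.dtd" },
  { name: "brandDTD", url: "chrome://global/locale/brand.dtd" }
]);
```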

Certain localizable resources lend themselves to reuse. It makes sense to use the same strings across different content, which explains the inclusion of a DTD file in more than one XUL document. In Mozilla, this includes brand information, build ID numbers, and help resources.

Which is more appropriate to use: internal or external entities? Using the external approach is preferable because the content (XUL) does not have to be touched during the translation process. If someone opts to create a tool to extract and/or insert strings, their job would be much easier if they had to parse one less file type. This may remove context somewhat, but it can be overcome by actively commenting the DTD file.

11.3. String Bundles

String bundles are flat text files that contain text for the UI that is accessed in JavaScript, C++, and theoretically any language that fits within the Mozilla framework. These bundles hold strings that can be presented visually to the user via some functionality in the application at any time. A string may be anything from a dynamically changing menu item to an alert box, or from a URL to a placeholder that is filled depending on the context in which it is accessed. The bundle files are given an extension of .properties and they commonly reside in the locale directory with the DTD files.

A user interface can use one or more string bundles, each of which is defined in a <stringbundle> element and surrounded by a <stringbundleset> element. Example 11-3 contains the bundles used by the Mozilla browser.

Example 11-3. String bundles used by the Mozilla browser
<stringbundleset id="stringbundleset">
    <stringbundle id="bundle_navigator"
        src="chrome://navigator/locale/navigator.properties"/>
    <stringbundle id="bundle_brand"
        src="chrome://global/locale/brand.properties"/>
    <stringbundle id="bundle_navigator_region"
        src="chrome://navigator-region/locale/region.properties"/>
    <stringbundle id="bundle_brand_region"
        src="chrome://global-region/locale/region.properties"/>
    <stringbundle id="findBundle"
        src="chrome://global/locale/finddialog.properties"/>
</stringbundleset>

As you can see from their names and their locations in the chrome, each bundle serves a different purpose. They include a file that contains the bulk of the strings for the browser (navigator.properties), a file that includes branding strings, and a couple of files for regional information. This model is useful if you need to output many strings to the UI from your source code and would like to organize them into meaningful groups.

11.3.1. Inside a Bundle

A string bundle (.properties) file has a very simple format. It contains one or more lines that have the identifier associated with the localizable string. The format of a string bundle string with an identifier is:

Identifier=String

The format for comments in a bundle file requires the hash notation (#). Comments are useful for notifying translators of the context of strings, or flagging a string that should be left as is and not localized. Comments in properties files are formatted in the following manner.

# DO NOT TRANSLATE
applicationTitle=xFly

Spaces in bundles are treated literally -- spaces between words are observed, with the exception of the start and the end of the string.
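To make the format concrete, here is a minimal sketch of a reader for the subset of the .properties format described in this section. It is plain JavaScript written for illustration, not the parser Mozilla uses, and the function name parseProperties is hypothetical. Escape sequences and line continuations are not handled.

```javascript
// Hypothetical parser for the simple subset of the .properties format
// described above: one Identifier=String pair per line, # comments,
// and whitespace trimmed from both ends of the identifier and the string.
function parseProperties(text) {
  var strings = {};
  var lines = text.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i].trim();
    if (line === "" || line.charAt(0) === "#") {
      continue; // blank line or comment
    }
    var separator = line.indexOf("=");
    if (separator === -1) {
      continue; // not an Identifier=String line
    }
    var id = line.substring(0, separator).trim();
    // Everything after the first "=" is the string; inner spaces are kept.
    strings[id] = line.substring(separator + 1).trim();
  }
  return strings;
}

var bundleText = "# DO NOT TRANSLATE\n" +
                 "applicationTitle=xFly\n" +
                 "pageChoices =  Go To...  \n";
var strings = parseProperties(bundleText);
```

In this sketch only the first equals sign on a line separates the identifier from the string, so a string may itself contain an equals sign.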

The next section shows the methods and properties specific to the <stringbundle> element that are available to you when you use it. The implementations are contained in the binding for the element.

11.3.2. String Bundle Methods and Properties

Defining your bundle in XUL and then creating the file with the values is only half the story. This section shows how to extract the values from the bundle and place them in the UI. The language of choice in these examples is JavaScript. This process is necessary when you have to change values in the UI because DTD entities cannot be updated dynamically.

11.3.2.1. Methods

Our bundle is defined in XUL like this:

<stringbundle id="bundle_xfly"
    src="chrome://xfly/locale/xfly.properties"/>

To access the methods of the bundle object in your script, you have to get a handle on the XUL element by using its id. First declare the variable globally that will be holding the bundle:

var xFlyBundle;

Then assign the variable to the bundle. A good place to do this is in the load handler function of your XUL window, or in the constructor for your binding if you are using it from there:

xFlyBundle = document.getElementById("bundle_xfly");

Now that you have access to the bundle, you can use the available methods to retrieve the strings. The two main functions are getString and getFormattedString.

11.3.2.1.1. getString

The most straightforward string access method, getString, takes one parameter (namely the identifier of the string) and returns the localizable string value for use in the UI:

var readonly = xFlyBundle.getString('readonlyFile');
alert(readonly);

The string bundle entry looks like this:

readonlyFile=This file is read only

11.3.2.1.2. getFormattedString

This function takes an extra parameter -- an array of string values, which are substituted into the string in the bundle. Then the full string with the substituted values is returned:

var numFiles = numberInEditor;
var numFilesMsg = xFlyBundle.getFormattedString("numFilesMessage", [numFiles]);

You can have more than one value replaced in the string, each one delimited within the square brackets by using a comma:

var fileInfo = xFlyBundle.getFormattedString("flyFileInformation",
    [fileName, fileSize]);

The string bundle entry looks like this:

flyFileInformation=The file is called %1$s and its size is %2$s

The number after the percent sign (%1, %2) refers to the position of the value to be substituted, counting from the first item in the array. The type of the value is given by the character that follows the dollar ($) symbol. In this case, there are two possibilities -- $s is a string value and $d is an integer value.
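Numbered placeholders matter for localization because a translator can reorder them to suit the grammar of the target language without any change to the calling code. The following plain JavaScript sketch imitates the substitution so that the effect can be seen outside Mozilla. The function formatString is a hypothetical stand-in, not the code behind getFormattedString, and the file name and size are made up.

```javascript
// Hypothetical stand-in for the substitution step: replace each %N$s or
// %N$d placeholder with the Nth entry (counting from 1) of `values`.
// $d entries are converted to integers; $s entries are used as text.
function formatString(template, values) {
  return template.replace(/%(\d+)\$([sd])/g, function (whole, position, type) {
    var index = parseInt(position, 10) - 1;
    if (index < 0 || index >= values.length) {
      return whole; // leave placeholders without a value untouched
    }
    var value = values[index];
    return type === "d" ? String(parseInt(value, 10)) : String(value);
  });
}

var original = "The file is called %1$s and its size is %2$s";
var reordered = "Size %2$s: file %1$s";

// The same array of values serves both word orders.
var messageA = formatString(original, ["fly.txt", "12 KB"]);
var messageB = formatString(reordered, ["fly.txt", "12 KB"]);
```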

11.3.2.2. Properties

Some binding properties that are exposed to your script accompany the methods. These properties are not often needed for routine retrieval of string values, but are useful to know nonetheless if you ever need to discover or share the meta information related to your bundle and locale.

11.3.2.2.1. stringBundle

This property is the underlying string bundle object. The binding obtains it through the nsIStringBundleService interface, and it makes the XPCOM string bundle methods available to your script. It is the direct way of getting a string from a bundle:

var appBundle = document.getElementById("bundle_app");
return appBundle.stringBundle.GetStringFromName("chapter11");

11.3.2.2.2. src

This property is the attribute used to get and set the properties file that will be used as a string bundle:

var appBundle = document.getElementById("bundle_app");
dump("You are using the properties file " + appBundle.src);

11.3.3. Creating Your Own Bundle

The implementation for setting up your string bundle just described is hidden from the XUL author. You only need to point at the bundle you want to use by using the src attribute. There is, however, an alternative way to do this if you do not favor using <stringbundle> or would like to extend that binding.

The alternative is to use utility routines that come bundled with Mozilla and are contained in a string resources JavaScript file: strres.js. With this file, creating a bundle is a three-step process.

  1. Include the JavaScript file:
     <script type="application/x-javascript"
            src="chrome://global/content/strres.js"/>
  2. Set up your bundle:
     var bundle =
            srGetStrBundle("chrome://mypackage/locale/mypackage.properties");
  3. Access the strings:
     var greeting = bundle.GetStringFromName( "hello" );

This call retrieves the string corresponding to "hello" in your bundle file and is the equivalent of the getString call when using the XUL bundle method.

If your chrome is independent of Mozilla's chrome and you do not want to use their UI files, you can create the bundle directly by using the nsIStringBundleService XPCOM interface, as seen in Example 11-4.

Example 11-4. Creating the bundle via XPConnect
var src = 'chrome://packagexfly/content/packagebundle.properties';
var localeService =
    Components.classes["@mozilla.org/intl/nslocaleservice;1"]
    .getService(Components.interfaces.nsILocaleService);
var appLocale =  localeService.GetApplicationLocale( );
var stringBundleService =
    Components.classes["@mozilla.org/intl/stringbundle;1"]
    .getService(Components.interfaces.nsIStringBundleService);
var bundle = stringBundleService.CreateBundle(src, appLocale);

The first step is to get the application locale -- the language that is currently registered with the chrome service. This is done via the nsILocaleService component. The nsIStringBundleService is then initialized and the CreateBundle method is called, returning an instance of nsIStringBundle that provides access to the methods for querying strings.
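Whichever way the bundle is created, it is worth deciding what should happen when an identifier is missing from the properties file. The sketch below assumes that a lookup for a missing identifier throws an exception, and it wraps the call so that the UI gets a visible fallback instead. The function getStringOrFallback and the stand-in bundle object are hypothetical; the stand-in exists only so that the sketch can run outside Mozilla.

```javascript
// Hypothetical wrapper: look up `name` in any object that offers a
// GetStringFromName method, returning `fallback` if the lookup throws.
function getStringOrFallback(bundle, name, fallback) {
  try {
    return bundle.GetStringFromName(name);
  } catch (e) {
    return fallback;
  }
}

// Stand-in bundle, assumed to throw for unknown identifiers.
var fakeBundle = {
  strings: { hello: "Hello, world" },
  GetStringFromName: function (name) {
    if (!Object.prototype.hasOwnProperty.call(this.strings, name)) {
      throw new Error("No string named " + name);
    }
    return this.strings[name];
  }
};

var greeting = getStringOrFallback(fakeBundle, "hello", "[hello]");
var farewell = getStringOrFallback(fakeBundle, "goodbye", "[goodbye]");
```

Bracketing the identifier in the fallback is one possible design choice: it keeps the application running while making the untranslated string easy to spot during testing.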

11.4. Programming and Localization

This section provides little nuggets of information, not necessarily related, that show how to work around common problems when programming locale-related information in your application. It strays a little from the main path of string replacement and translation, and the topics vary from recommended naming conventions for your string identifiers to locale in XBL bindings and what tools you can use to be more productive.

11.4.1. Naming Conventions

The question of what to call your code internals has come up more than once in this book. In Chapter 8, you decided on the name of the component's IDL interface file and its associated implementation. In locale, the names in question are the entity names and the string identifiers contained in bundles.

Naming conventions in localization are useful because they provide some context to the translator. In this spirit, it is good for the reference to be as descriptive as possible. You can choose your route for naming or stick with the way that Mozilla does it. Examining the files in the Mozilla source base, common naming conventions for entities include the following:

id.label
id.tooltip
id.text
id.accesskey
id.commandkey

Certain XUL widgets can contain multiple localizable resources, including a text label or description, a tooltip, and an accesskey. A button is a prime example:

<button id="flyBtn" label="&flyBtn.label;" accesskey="&flyBtn.accesskey;"
    tooltiptext="&flyBtn.tooltip;" />

The naming convention is consistent: the value of the id attribute is followed by the name of the UI feature, and the two are delimited by a period. Not only does this naming flag the resource as being associated with a certain widget, but it also permits logical grouping in the DTD:

<!ENTITY flyBtn.label "Fly Away">
<!ENTITY flyBtn.accesskey "f">
<!ENTITY flyBtn.tooltip "Click here to take to the air">
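A convention this regular can also be checked by a script. The following plain JavaScript sketch groups entity names by widget id and lists any name whose suffix is not one of the conventional ones shown earlier. The function name groupEntityNames is hypothetical, and treating the text after the last period as the suffix is an assumption of this sketch.

```javascript
// Hypothetical helper: group entity names of the form id.suffix by their
// id, and report names whose suffix is not one of the conventional ones.
function groupEntityNames(names) {
  var conventional = ["label", "tooltip", "text", "accesskey", "commandkey"];
  var groups = {};
  var unconventional = [];
  for (var i = 0; i < names.length; i++) {
    var dot = names[i].lastIndexOf(".");
    var id = dot === -1 ? names[i] : names[i].substring(0, dot);
    var suffix = dot === -1 ? "" : names[i].substring(dot + 1);
    if (conventional.indexOf(suffix) === -1) {
      unconventional.push(names[i]);
    }
    if (!Object.prototype.hasOwnProperty.call(groups, id)) {
      groups[id] = [];
    }
    groups[id].push(suffix);
  }
  return { groups: groups, unconventional: unconventional };
}

var report = groupEntityNames([
  "flyBtn.label", "flyBtn.accesskey", "flyBtn.tooltip", "windowTitle"
]);
```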

Naming string identifiers in bundle files follows less of a pattern than naming entities in DTDs, and the identifiers in the Mozilla source files may appear random. If a pattern must be found, you could look at two things: filenames and identifier descriptions.

With filenames, a single .properties file is associated with a logical part of the application. If a string appears in a certain dialog or window, you know where to go to translate the strings or add more strings. Example files in the Mozilla tree worth examining include editor.properties, commonDialogs.properties, and wizardManager.properties.

With identifier descriptions, the text used for the identifier describes what the string actually refers to. The goal is to be as descriptive as possible while using as little text as possible:

dontDeleteFiles=Don't Delete Files

The descriptor is the same as the value, although in a different format. The opportunity was taken here to be as descriptive as possible.
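If you adopt this style, the identifier can be derived from the English text when a string is first added. The following is a small sketch in plain JavaScript; the function name toIdentifier is hypothetical, and the rule it applies (drop apostrophes, split on anything that is not a letter or digit, then join in camel case) is an assumption rather than a documented Mozilla convention.

```javascript
// Hypothetical helper: derive a camelCase identifier from display text,
// so that "Don't Delete Files" becomes "dontDeleteFiles".
function toIdentifier(text) {
  // Drop straight and curly apostrophes, then split on non-alphanumerics.
  var words = text.replace(/['\u2019]/g, "").split(/[^A-Za-z0-9]+/);
  var id = "";
  for (var i = 0; i < words.length; i++) {
    if (words[i] === "") {
      continue; // produced by leading or trailing punctuation
    }
    var word = words[i].toLowerCase();
    id += id === "" ? word : word.charAt(0).toUpperCase() + word.substring(1);
  }
  return id;
}

var identifier = toIdentifier("Don't Delete Files");
```

Long strings produce unwieldy identifiers this way, so a derived name is best treated as a starting point to be shortened by hand.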

11.4.2. Breaking Up the Text

Under certain circumstances, you may need to pop up your own alert messages as XUL dialogs. Some messages may involve multiple lines of text that need to be put on new lines. There is no natural delimiter that breaks up the text contained within <description> or <label> elements in XUL, so following are a couple of tricks to get around this problem.

11.4.2.1. Method 1: Multiple <description> elements

First, create the placeholder in your XUL where the generated elements will be inserted:

<vbox id="main-message" flex="1" style="max-width: 40em;">
  <!-- insert elements here -->
</vbox>

The script in Example 11-5 generates the needed text elements, fills in the text, and appends all the items to the containing box.

Example 11-5. Using multiple <description> elements
var text = window.arguments[0];
var holder = document.getElementById("main-message");
var lines = text.split("\n");
for (var i = 0; i < lines.length; i++) {
  var descriptionNode = document.createElement("description");
  var linetext = document.createTextNode(lines[i]);
  descriptionNode.appendChild(linetext);
  holder.appendChild(descriptionNode);
}

The text is passed into the window that is used for the message. It presumes that the \n delimiter is used to signify a new line in the text and is split thus. Then it loops through each line, creating a description element for each line and populating it with a text node with the message inside. Then each element is appended to the main container that lives in the XUL file.
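Splitting on \n alone works when the message uses Unix line endings. If the text may come from other platforms, the split can be made more tolerant. The following is a minimal sketch in plain JavaScript; the function name splitMessageLines and the sample message are hypothetical.

```javascript
// Hypothetical helper: split a message into display lines, accepting
// Unix (\n), Windows (\r\n), and old Macintosh (\r) line endings, and
// dropping a single trailing empty line left by a final line break.
function splitMessageLines(text) {
  var lines = text.split(/\r\n|\r|\n/);
  if (lines.length > 1 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  return lines;
}

var messageLines =
  splitMessageLines("The file could not be saved.\r\n\r\nCheck the disk.\n");
```

The returned array could take the place of the result of text.split("\n") in the loop that creates the elements.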

11.4.2.2. Method 2: HTML <br> tag

For this example, create the XUL placeholder similar to the example in Method 1, and then slot the script in Example 11-6 into your load handler.

Example 11-6. Using the HTML break tag
var text = window.arguments[0];
var holder = document.getElementById("main-message");
var lines = text.split("\n");
var descriptionNode = document.createElement("description");
for (var i = 0; i < lines.length; i++) {
  var linetext = document.createTextNode(lines[i]);
  // <br> is an HTML element, so create it in the XHTML namespace
  var breakNode = document.createElementNS("http://www.w3.org/1999/xhtml", "html:br");
  descriptionNode.appendChild(linetext);
  descriptionNode.appendChild(breakNode);
}
holder.appendChild(descriptionNode);

This approach is similar to the code in Example 11-5, with some notable differences. First, only one <description> element is created, outside the loop, rather than one for each line. Second, inside the loop, each line break is produced by inserting an HTML <br> element after a piece of text.

With both methods, you need to put some sort of width constraint on the window at the level where you want the text to wrap. Method 1 is recommended because it is a true XUL solution, but the second method is also a good example of mixing HTML markup into a XUL document.

11.4.3. Anonymous Content and Locale

Entities are everywhere. Well, not quite everywhere. However, as entity references and DTD constructs are part of the XML language, they can be used for localization purposes in other files in your package, such as RDF and XBL files.

In the case of XBL, it is common for binding content to inherit its locale information from the base widget. Take Example 11-7 as a case in point. Here is the bound element as it appears in the XUL document; its binding follows in the example:

<article id="artheader" class="articleheader" title="Common Garden Flies" author="Brad Buzzworth"/>

The attributes of note here are title and author, both user-defined, because they contain the localizable values that will be used in the binding.

Example 11-7. Binding with attribute inheritance
<binding id="articleheader">
  <content>
    <xul:hbox flex="1">
      <xul:label class="flybox-homeheader-text" xbl:inherits="value=title"/>
      <xul:spacer flex="1"/>
      <xul:label class="flybox-homeheader-text" xbl:inherits="value=author"/>
    </xul:hbox>
  </content>
  <implementation>
    <property name="title">
      <setter>
        <![CDATA[
          this.setAttribute('title',val); return val;
        ]]>
      </setter>
      <getter>
        <![CDATA[
          return this.getAttribute('title');
        ]]>
      </getter>
    </property>
    <property name="author">
      <setter>
        <![CDATA[
          this.setAttribute('author',val); return val;
        ]]>
      </setter>
      <getter>
        <![CDATA[
          return this.getAttribute('author');
        ]]>
      </getter>
    </property>
  </implementation>
</binding>

Example 11-7 shows a binding whose content inherits its locale from the bound element. The attributes used on the bound element, namely title and author, are descriptive, enabling the author to be specific about what they are setting a value to. The rest is taken care of in the binding, where the xbl:inherits attribute maps each descriptive attribute on the bound element to the value attribute of a label in the anonymous content. You can retrieve or set the values from script through the title and author properties, which are backed by the getters and setters.

Localization Tools

To translate your XUL interface strings, just change the text that corresponds to your entity reference or string bundle value. For a small application, this step should be simple, but for large applications, it can be a big task.

The good news is that tools are available to help localize your applications. The most popular tool is MozillaTranslator, which is discussed in more detail in Appendix B.

There is also a handy command-line utility for Unicode conversion called nsconv, bundled in the Mozilla bin folder in any distribution. (If you are unfamiliar with Unicode, Section 11.6.1 later in this chapter provides more information.) Although it is broken at the time of this writing, it is worth mentioning. Let's look at a simple conversion of ASCII text to UTF-8, starting with this entity:

<!ENTITY PrintPreviewCmd.label    "Print Preview">

Replace the string in the entity with the Spanish version:

<!ENTITY PrintPreviewCmd.label    "Presentación preliminar...">

Then run the conversion.

> nsconv -f ascii -t utf-8 foo.dtd bar.dtd

The accented character is converted into its Unicode numeric character reference for you:

<!ENTITY PrintPreviewCmd.label   "Presentaci&#243;n preliminar...">

Using an NCR or a CER in place of the raw character is also acceptable where appropriate. An NCR (numeric character reference) gives the character's code in hexadecimal (&#x61;) or decimal (&#97;) form, while a CER (character entity reference) uses a mnemonic name (&eacute;). Either way, you need to know the code or the name of the character. String bundles accept only one form of encoding, which is known as escape-unicode. If you are using nsconv, the name for this encoding is x-u-escaped.
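
Because nsconv may not be usable, it can help to see what these two escaped forms look like when produced by hand. The sketch below is not a replacement for nsconv; both function names are hypothetical, and the second function assumes that escape-unicode means the \uXXXX notation used in .properties files.

```javascript
// Hypothetical helpers (not part of Mozilla or nsconv).

// Replace every non-ASCII character with a decimal numeric character
// reference (NCR), the form shown in the DTD entity above.
function toDecimalNCR(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    // Combine a surrogate pair into a single code point.
    if (code >= 0xD800 && code <= 0xDBFF && i + 1 < text.length) {
      var low = text.charCodeAt(i + 1);
      if (low >= 0xDC00 && low <= 0xDFFF) {
        code = (code - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
        i++;
      }
    }
    out += code < 0x80 ? text.charAt(i) : "&#" + code + ";";
  }
  return out;
}

// Replace every non-ASCII UTF-16 code unit with a \uXXXX escape
// (assumed here to be what string bundles expect).
function toUnicodeEscapes(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var unit = text.charCodeAt(i);
    if (unit < 0x80) {
      out += text.charAt(i);
    } else {
      var hex = unit.toString(16).toUpperCase();
      out += "\\u" + ("0000" + hex).slice(-4);
    }
  }
  return out;
}

var label = "Presentaci\u00F3n preliminar...";
var asNCR = toDecimalNCR(label);        // Presentaci&#243;n preliminar...
var asEscape = toUnicodeEscapes(label); // Presentaci\u00F3n preliminar... (with a literal backslash)
```

The first form is suitable for entity values in a DTD, and the second for values in a string bundle. Neither function escapes characters such as & or < that are already special in XML.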

Various third-party conversion tools that do the same thing are available. One is Unipad, a freeware editor that lets you import documents in multiple native encodings and then save them as Unicode. Unipad is available from http://www.unipad.org/.

11.4.4. Localizable Resources in HTML

As a web application, Mozilla permits seamless integration of web content, both local and remote, in many formats. If you have verbose text that just needs to be displayed somewhere in the framework of your application, HTML or XML content may be ideal for this purpose. Through the use of XUL content widgets, such as <iframe> and <browser>, you have ready-made frames to slot your content into:

<iframe src="xFly.html" flex="1"/>

Therefore, translating xFly.html into the local language leaves the main application untouched. Some other uses of HTML or XML content include an "About" dialog/page, Help pages, a wizard interface, or a getting started/introduction page.

11.4.5. Localizable Resources in RDF

Strings to be converted in RDF content can take more than one form. You can use entities directly in your RDF file or have the text inline in your node descriptions. Whichever method you choose, you must ensure that the file is installed in the right place in the tree and registered correctly for the application to pick up on it.

As an XML markup, RDF can handle inline entity definitions. These entity definitions have been covered thoroughly in the chapter so far. Example 11-8 looks at localizable strings contained directly in RDF node descriptions. This example is taken from the Help table of contents in the Mozilla tree.

Example 11-8. RDF Description node with localizable text
<rdf:Description about="#nav-doc">
  <nc:subheadings>
    <rdf:Seq>
      <rdf:li>
        <rdf:Description ID="nav-doc-language"
                         nc:name="Language and Translation Services"
                         nc:link="chrome://help/locale/nav_help.html#nav_language"/>
      </rdf:li>
    </rdf:Seq>
  </nc:subheadings>
</rdf:Description>

The text in the nc:name attribute is the text that will be changed. Note that this issue of text in RDF is separate from the topic of using RDF as the mechanism in the chrome registry to register your locale and set up a switching mechanism. This difference is addressed in the next section.

11.5. The Chrome Registry and Locale

Your application is built and you're ready to upload your shiny new Mozilla program to your server for download. The last piece of the puzzle, locale versions, has been put in place. With the structures that Mozilla provides, localization no longer has to be an afterthought. Once you have the translated files, you need to decide how you want to distribute your language versions, which languages you want to make available to users, and the level of customization that you want to give them.

In this section, we look at how the Mozilla application suite handles the chrome's locale component. Then you see how to apply these chrome registry structures and utilities on a more generic level for your application.

11.5.1. The Directory Structure

A typical application chrome structure looks like the directory structure in Figure 11-2. A folder for each language is under the locale directory. The general format is that each language has a unique identifier made up of a language code and a country code. This conforms to the ISO-639 two-letter language code and ISO-3166 two-letter country code standards.

The W3C site has good resources that provide information about the ISO-639 and ISO-3166 standards at http://www.w3.org/International/O-HTML-tags.html.

For example, the unique identifier for English as used in Great Britain is en-GB. The first code, en, is the language code for English, and the second code, GB, is the country code for Great Britain. This is the standard that Mozilla follows.
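
To make the two-part format concrete, here is a minimal sketch that splits an identifier into its language and country parts and normalizes the letter case. The function name parseLocaleId is hypothetical, and the check only covers the shape of the identifier, not whether the codes are actually assigned in ISO-639 or ISO-3166.

```javascript
// Hypothetical helper: split a locale identifier such as "en-US" into its
// language code and country code, normalizing the case of each part.
function parseLocaleId(id) {
  var match = /^([A-Za-z]{2})-([A-Za-z]{2})$/.exec(id);
  if (!match) {
    return null; // not in two-letter language, hyphen, two-letter country form
  }
  var language = match[1].toLowerCase();
  var country = match[2].toUpperCase();
  return { language: language, country: country, id: language + "-" + country };
}

var parsed = parseLocaleId("en-us");
// parsed.language is "en", parsed.country is "US", parsed.id is "en-US"
```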

Figure 11-2. Locale's placement in typical chrome layout

The folder that is registered is the language folder, which is what has to be changed on an install. Thus, the URL chrome://package/locale actually points to package/locale/en-US or whichever language is turned on at the time. The language folder may in turn include subfolders that contain logical units for your application.
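
The mapping from a locale URL to the active language folder can be pictured with a small sketch. This is an illustration only: the real lookup is performed by the chrome registry, and the function name resolveLocaleUrl and the file name xfly.dtd are hypothetical.

```javascript
// Hypothetical illustration: show which folder a chrome locale URL would
// land in for a given selected locale. The chrome registry does this work
// in a real application.
function resolveLocaleUrl(chromeUrl, selectedLocale) {
  var match = /^chrome:\/\/([^\/]+)\/locale(?:\/(.*))?$/.exec(chromeUrl);
  if (!match) {
    return null; // not a chrome locale URL
  }
  var packageName = match[1];
  var rest = match[2] || "";
  return packageName + "/locale/" + selectedLocale + "/" + rest;
}

var resolved = resolveLocaleUrl("chrome://xfly/locale/xfly.dtd", "en-US");
// resolved is "xfly/locale/en-US/xfly.dtd"
```

Passing a different selectedLocale changes only the language folder in the result, which is why application code can keep using the same chrome:// URL regardless of the language that is turned on.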

11.5.2. Interaction with the Chrome Registry

As pointed out in Chapter 6, your package directories need to be registered as chrome with the chrome registry. The first step is to ensure that the entry for your package component is in the file chrome.rdf in the root chrome directory.

A resource:/ URL points to the folder for your files to be picked up and recognized by the chrome mechanism and accessed via chrome:// URLs in your application code. The locale is no exception.

<RDF:Description about="urn:mozilla:locale:en-US:xfly"
    c:baseURL="resource:/chrome/xfly/locale/en-US/"
    c:localeVersion="0.1.0.0">
  <c:package resource="urn:mozilla:package:xfly"/>
</RDF:Description>

The chrome registry has a built-in versioning system that uses the c:localeVersion descriptor, which is useful if you plan on distributing multiple language packs for your application. Other descriptors are available if you choose to use them: display name (c:displayName), internal name (c:name), location type (c:locType), and author (c:author).

11.5.3. Distribution

Language distribution may not be an issue for you. If, for example, your application were only going to be localized into a finite number of languages, bundling each of them up with the main installer would be most convenient. If, however, the need for new language versions arises at various intervals in the release process, you need to find a way to make them available and install them on top of an existing installation.

For example, as more people from various locations in the world are becoming aware of the Mozilla project, they want to customize it into their own language. Here are the steps that you need to take to set up your version.

  1. Register as a contributor and set up the resources that you need, if any (web page, mailing list). This will ensure that you are added to the project page on the mozilla.org site.
  2. Get a copy of Mozilla to test either via a binary distribution or by downloading and building your own source (see Appendix A for more information).
  3. Translate the files.
  4. Package your new files for distribution.
  5. Test and submit your work.

Step 4, the packaging of the new language pack, is discussed next. Mozilla's Cross-Platform Install (XPI) is the ideal candidate for achieving this packaging. This method is discussed extensively in Chapter 6. This distribution method provides great flexibility and has the benefit of being native to Mozilla, thus bypassing the search for external install technologies for your application.

11.5.3.1. The anatomy of an install script

Example 11-9 presents a script that is based on the Mozilla process that distributes localized language packs. It presumes that there is a single JAR file for the language that is installed and registered in the Mozilla binary's chrome root.

The XPI archive consists of the JAR file in a bin/chrome directory and the install.js file, together in a compressed archive with an .xpi extension. Simply clicking on a web page link to this file invokes the Mozilla software installation service and installs your language. For convenience, inline comments in Example 11-9 explain what is happening.

Example 11-9. The locale XPI install script, install.js
function verifyDiskSpace(dirPath, spaceRequired)
{
  var spaceAvailable;
  spaceAvailable = fileGetDiskSpaceAvailable(dirPath);
  spaceAvailable = parseInt(spaceAvailable / 1024);
  if(spaceAvailable < spaceRequired)
  {
    logComment("Insufficient disk space: " + dirPath);
    logComment("  required : " + spaceRequired + " K");
    logComment("  available: " + spaceAvailable + " K");
    return(false);
  }
  return(true);
}
// platform detection
function getPlatform( ) {
  var platformStr;
  var platformNode;
  if('platform' in Install) {
    platformStr = new String(Install.platform);
    if (!platformStr.search(/^Macintosh/))
      platformNode = 'mac';
    else if (!platformStr.search(/^Win/))
      platformNode = 'win';
    else
      platformNode = 'unix';
  }
  else {
    var fOSMac  = getFolder("Mac System");
    var fOSWin  = getFolder("Win System");
    logComment("fOSMac: "  + fOSMac);
    logComment("fOSWin: "  + fOSWin);
    if(fOSMac != null)
      platformNode = 'mac';
    else if(fOSWin != null)
      platformNode = 'win';
    else
      platformNode = 'unix';
  }
  return platformNode;
}
// Size in KB of JAR file
var srDest = 500;
var err;
var fProgram;
var platformNode;
platformNode = getPlatform( );
//  -- - LOCALIZATION NOTE: translate only these  -- -
// These fields are changeable in this generic script
var prettyName = "Irish";
var langcode = "ga";     // ISO-639 language code for Irish
var regioncode = "IE";   // ISO-3166 country code for Ireland
var chromeNode = langcode + "-" + regioncode;
//  -- - END LOCALIZABLE RESOURCES  -- -
// build the paths and file names for registry and chrome:// url access
var regName    = "locales/mozilla/" + chromeNode;
var chromeName = chromeNode + ".jar";
var regionFile = regioncode + ".jar";
var platformName = langcode + "-" + platformNode + ".jar";
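var localeName = "locale/" + chromeNode + "/";
// regionName is used below but was never defined; this value assumes the
// region JAR keeps its files under locale/<regioncode>/
var regionName = "locale/" + regioncode + "/";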
// Start the installation
err = initInstall(prettyName, regName, "0.1.0.0");
logComment("initInstall: " + err);
fProgram = getFolder("Program");
logComment("fProgram: " + fProgram);
// Check disk space using utility function at the start of the script
if (verifyDiskSpace(fProgram, srDest))
{
  err = addDirectory("",
  "bin",
  fProgram,
  "");
  logComment("addDirectory( ) returned: " + err);
  // register chrome
  var cf = getFolder(fProgram, "chrome/"+chromeName);
  var pf = getFolder(fProgram, "chrome/"+platformName);
  var rf = getFolder(fProgram, "chrome/"+regionFile);
  var chromeType = LOCALE | DELAYED_CHROME;
  registerChrome(chromeType, cf, localeName + "global/");
  registerChrome(chromeType, cf, localeName + "communicator/");
  registerChrome(chromeType, cf, localeName + "content-packs/");
  registerChrome(chromeType, cf, localeName + "cookie/");
  registerChrome(chromeType, cf, localeName + "editor/");
  registerChrome(chromeType, cf, localeName + "forms/");
  registerChrome(chromeType, cf, localeName + "help/");
  registerChrome(chromeType, cf, localeName + "messenger/");
  registerChrome(chromeType, cf, localeName + "messenger-smime/");
  registerChrome(chromeType, cf, localeName + "mozldap/");
  registerChrome(chromeType, cf, localeName + "navigator/");
  registerChrome(chromeType, cf, localeName + "necko/");
  registerChrome(chromeType, cf, localeName + "pipnss/");
  registerChrome(chromeType, cf, localeName + "pippki/");
  registerChrome(chromeType, cf, localeName + "wallet/");
  registerChrome(chromeType, pf, localeName + "global-platform/");
  registerChrome(chromeType, pf, localeName + "communicator-platform/");
  registerChrome(chromeType, pf, localeName + "navigator-platform/");
  if (platformNode == "win") {
    registerChrome(chromeType, pf, localeName + "messenger-mapi/");
  }
  registerChrome(chromeType, rf, regionName + "global-region/");
  registerChrome(chromeType, rf, regionName + "communicator-region/");
  registerChrome(chromeType, rf, regionName + "editor-region/");
  registerChrome(chromeType, rf, regionName + "messenger-region/");
  registerChrome(chromeType, rf, regionName + "navigator-region/");
  if (err == SUCCESS)
  {
    // complete the installation
    err = performInstall( );
    logComment("performInstall( ) returned: " + err);
  }
  else
  {
    // cancel the installation
    cancelInstall(err);
    logComment("cancelInstall due to error: " + err);
  }
}
else
{
  // if we enter this section,
  // there is not enough disk space for installation
  cancelInstall(INSUFFICIENT_DISK_SPACE);
}

By changing some values of the changeable fields, you can tailor this script to handle the install in any directory in the chrome (cf) that you want and register the chrome URL (localeName) for use. The rest is handled by the built-in functionality in XPI provided by such functions as initInstall and performInstall.
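
To see what the changeable fields produce before running an install, the naming logic can be pulled out into a function and inspected on its own. This is a sketch that mirrors the string concatenations in Example 11-9; the function name buildLocaleNames is hypothetical, and the fr/FR values are sample input only.

```javascript
// Sketch of the naming logic from Example 11-9, gathered into one
// hypothetical function so the derived values can be inspected.
function buildLocaleNames(langcode, regioncode, platformNode) {
  var chromeNode = langcode + "-" + regioncode;
  return {
    chromeNode: chromeNode,
    regName: "locales/mozilla/" + chromeNode,              // registry key
    chromeName: chromeNode + ".jar",                       // main language JAR
    regionFile: regioncode + ".jar",                       // region JAR
    platformName: langcode + "-" + platformNode + ".jar",  // platform JAR
    localeName: "locale/" + chromeNode + "/"               // chrome path prefix
  };
}

var names = buildLocaleNames("fr", "FR", "win");
// names.chromeName is "fr-FR.jar", names.platformName is "fr-win.jar",
// names.localeName is "locale/fr-FR/"
```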

11.5.3.2. Switching languages

The mechanism for switching languages can take many forms. Mozilla switches languages by updating an RDF datasource when a language pack is installed. The UI for switching languages in Mozilla is in the main Preferences (Edit > Preferences). Within the preferences area, the language/content panel (Appearance > Languages/Content) interacts with the chrome registry when loaded, reading in the installed language packs and populating a selectable list with the available language identifiers. Selecting one language and restarting Mozilla changes the interface for the user. Example 11-10 is a simple script for switching locales.

Example 11-10. Locale-switching script
function switchLocale(langcode)
{
  try {
    var chromeRegistry = Components.classes["@mozilla.org/chrome/chrome-registry;1"].getService(Components.interfaces.nsIChromeRegistry);
    chromeRegistry.selectLocale(langcode, true);
    var observerService = Components.classes["@mozilla.org/observer-service;1"]
        .getService(Components.interfaces.nsIObserverService);
    observerService.notifyObservers(null, "locale-selected", null);
    var prefUtilBundle = srGetStrBundle
        ("chrome://communicator/locale/pref/prefutilities.properties");
    var brandBundle = srGetStrBundle
        ("chrome://global/locale/brand.properties");
    var alertText = prefUtilBundle.GetStringFromName("languageAlert");
    var titleText = prefUtilBundle.GetStringFromName("languageTitle");
    alertText = alertText.replace(/%brand%/g,
        brandBundle.GetStringFromName("brandShortName"));
    var promptService = Components.classes["@mozilla.org/embedcomp/prompt-service;1"]
        .getService( );
    promptService = promptService.QueryInterface
        (Components.interfaces.nsIPromptService);
    promptService.alert(window, titleText, alertText);
  }
  catch(e) {
    return false;
  }
  return true;
}

The language code is passed in as a parameter to the switchLocale JavaScript function in Example 11-10. The locale is set via the nsIChromeRegistry component's selectLocale method. This locale selection takes place in the first few lines; the code then notifies observers of the "locale-selected" topic, and the rest prepares and shows a prompt to the user. This prompt reminds the user to restart Mozilla to ensure that the new locale takes effect.
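
The %brand% substitution in Example 11-10 is a pattern worth reusing, because it lets translators move the product name to wherever it belongs in their language. The sketch below isolates that step. The function name insertBrand is hypothetical, and the sample sentence is invented for illustration; it is not the actual languageAlert string.

```javascript
// Hypothetical helper: substitute the %brand% placeholder in a localized
// string. A function is used as the replacement so that any "$" characters
// in the brand name are inserted literally.
function insertBrand(text, brandName) {
  return text.replace(/%brand%/g, function () {
    return brandName;
  });
}

var message = insertBrand("Restart %brand% to use the new language.", "Mozilla");
// message is "Restart Mozilla to use the new language."
```

Passing the brand name as a plain replacement string, as Example 11-10 does, works equally well as long as the name contains no $ sequences that String.replace treats specially.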

11.6. Localization Issues

This section aims to dig a little deeper into the issues of UI aesthetics and principles, in order to provide some background into the underlying encoding of documents in the XPFE framework. The main portion is taken up by a discussion of Unicode. There is some background to what Unicode is, how Mozilla uses it, and some practical conversion utilities to ensure that your files are in the correct encoding.

11.6.1. XPFE and Unicode

Unicode is a broad topic and we cannot hope to give you anywhere near a full understanding of what it is. However, a brief introduction will highlight its importance in the software world and show how it is used as one of the internationalization cornerstones in the Mozilla project.

For more in-depth information, refer to the book The Unicode Standard, Version 3.0 by the Unicode Consortium, published by Addison Wesley Longman. Another useful reference is Unicode: A Primer by Tony Graham, published by M&T Books.

Unicode is an encoding system used to represent every character with a unique number. It is a standard that came about when multiple encoding systems were merged. It became clear that keeping separate systems was hindering global communication, and applications were not able to exchange information with one another successfully. Now all major systems and applications are standardizing on Unicode. Most major operating systems, such as Windows, AIX, Solaris, and Mac OS, have already adopted it. The latest browsers, including Mozilla, support it. This quote from the Unicode Consortium (http://www.unicode.org/unicode/standard/WhatIsUnicode.html) sums it up the best:

Incorporating Unicode into client-server or multi-tiered applications and web sites offers significant cost savings over the use of legacy character sets. Unicode enables a single software product or a single web site to be targeted across multiple platforms, languages and countries without re-engineering. It allows data to be transported through many different systems without corruption.

There are seven character-encoding schemes in Unicode: UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, and UTF-32LE. UTF is an abbreviation for Unicode Transformation Format. The number in each name is the size in bits of the code unit: UTF-8 encodes a character as one to four 8-bit units, UTF-16 as one or two 16-bit units, and UTF-32 as a single 32-bit unit.
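
The practical effect of these differences is easiest to see by counting bytes. The following sketch computes how many bytes a string would occupy in UTF-8, UTF-16, and UTF-32, without a byte order mark. The function name encodedSizes is hypothetical; the byte counts follow from the code point ranges defined by the encodings themselves.

```javascript
// Count how many bytes a string needs in each encoding form
// (byte order marks are not counted).
function encodedSizes(text) {
  var utf8 = 0, utf16 = 0, utf32 = 0;
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    // A high surrogate followed by a low surrogate is one character.
    if (code >= 0xD800 && code <= 0xDBFF && i + 1 < text.length) {
      var low = text.charCodeAt(i + 1);
      if (low >= 0xDC00 && low <= 0xDFFF) {
        code = (code - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
        i++;
      }
    }
    utf8 += code < 0x80 ? 1 : code < 0x800 ? 2 : code < 0x10000 ? 3 : 4;
    utf16 += code < 0x10000 ? 2 : 4;
    utf32 += 4;
  }
  return { utf8: utf8, utf16: utf16, utf32: utf32 };
}

// "A" (U+0041), "ó" (U+00F3), and the euro sign (U+20AC)
var sizes = encodedSizes("A\u00F3\u20AC");
// sizes.utf8 is 6, sizes.utf16 is 6, sizes.utf32 is 12
```

For mostly ASCII files such as XUL and DTD sources, UTF-8 is the most compact of the three, since each ASCII character takes a single byte.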

One of Unicode's core principles is that it be able to handle any character set and that clients supporting it provide the tools necessary to convert. This conversion can be from Unicode to native character sets and vice versa. The number of native character sets is extensive and ranges from Central European (ISO-8859-2) to Thai (TIS-620).

The default encoding of XUL, XML, and RDF documents in Mozilla is UTF-8. If no encoding is specified in the XML declaration, this is the encoding that is used, and most files in the Mozilla tree leave the encoding unspecified for that reason. To use a different encoding, you need to change the XML declaration at the top of your file. To change your encoding to Central European, include:

<?xml version="1.0" encoding="ISO-8859-2" ?>
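
When checking a batch of files, it can be useful to read the declared encoding programmatically. The sketch below looks only at the declaration at the very start of the text and falls back to UTF-8 when no encoding is named. The function name declaredEncoding is hypothetical, and the sketch does not attempt byte order mark detection or any other part of a real parser's encoding sniffing.

```javascript
// Hypothetical helper: report the encoding named in the declaration at the
// start of an XML document, falling back to UTF-8 when none is given.
function declaredEncoding(xmlText) {
  var decl = /^<\?xml\s[^?]*\?>/.exec(xmlText);
  if (decl) {
    var enc = /\sencoding\s*=\s*(["'])([A-Za-z][A-Za-z0-9._-]*)\1/.exec(decl[0]);
    if (enc) {
      return enc[2];
    }
  }
  return "UTF-8";
}

var central = declaredEncoding('<?xml version="1.0" encoding="ISO-8859-2" ?><window/>');
var fallback = declaredEncoding('<?xml version="1.0"?><window/>');
// central is "ISO-8859-2", fallback is "UTF-8"
```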

11.6.2. Language Quirks

The size and proportion of your windows can come into play when you know your application will be localized into more than one language. In some languages, it takes more words or characters, hence more physical space, to bring meaning to some text. This is especially the case in widgets that contain more text, such as when you want to provide usage guidelines in a panel.

One solution that Mozilla uses in at least one place is to make the actual size of the window or widget into a localizable entity.

<window style="&window.size;" ...>
<!ENTITY  window.size             "width: 40em; height: 40em;">

The translator or developer can anticipate the size based on the number of words or preview their changes in the displayed UI. If the text overflows, they can increase the size, or decrease it if there is too much empty space.

As you begin to localize your application, especially if it is a web-related application, you will encounter words and phrases that have universal meaning and may not require translation. If you translate the whole Mozilla application, for example, you'll find that some words or phrases remain untouched. These items include terms that are used for branding, or universal web browsing terms, such as Bookmarks, Tasks, and Tools. In some instances, the choice to translate some of these terms is purely subjective.
