Chapter 7. Internationalization

If the Web is to reach a truly worldwide audience, it needs to be able to support the display of all the languages of the world, with all their unique alphabets and symbols, directionality, and specialized punctuation. This poses a big challenge to HTML constructs as we know them. However, according to the W3C, “energetic efforts” are being made toward this complicated goal.

The W3C’s efforts for internationalization (often referred to as “i18n”: an i, then 18 letters, then an n) address two primary issues. The first is the handling of alternative character sets that take into account all the writing systems of the world; the second is how to specify languages and their unique presentation requirements within an HTML document. Many solutions presented by internationalization experts in a document called RFC 2070 were incorporated into the current HTML 4.0, XML 1.0, and CSS2 specifications.

This chapter addresses key issues for internationalization, including character sets and the new language features in HTML 4.0 and CSS2. Be aware that even the most current browsers do not yet support many of these features.
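As a rough preview of the two issues, the sketch below shows one way a document might declare its character encoding and mark up the language and direction of its text, using the lang and dir attributes defined in HTML 4. The UTF-8 encoding, the sample phrases, and the page title are assumptions chosen for illustration; they do not come from the specifications discussed here.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!-- lang on the root element sets the default language of the document -->
<html lang="en">
<head>
  <!-- Issue 1: declare the character encoding (UTF-8 is an assumed choice) -->
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <title>Language markup sample</title>
</head>
<body>
  <!-- Issue 2: identify the language of an inline phrase -->
  <p>The phrase <span lang="fr">bon voyage</span> is marked as French.</p>

  <!-- dir="rtl" requests right-to-left layout for this paragraph -->
  <p lang="he" dir="rtl">שלום</p>
</body>
</html>
```

Whether a given browser honors these attributes varies, so treat this as an illustration of the markup rather than a guarantee of how it renders.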

Character Sets

The first challenge in internationalization is dealing with the staggering number of unique character shapes (called “glyphs”) that occur in the writing systems of the world. This includes not only alphabets, but also all ideographs (characters that indicate a whole word or concept) for languages such as Chinese, Japanese, and ...
