HTML 4.0 Language Tags
Coordinating character sets is only the first part of the challenge. Even languages that share a character set may have different rules for hyphenation, spacing, quotation marks, punctuation, and so on. Beyond character shapes (glyphs), issues such as directionality (whether the text reads left-to-right or right-to-left) and cursive joining behavior must be taken into account as well.
These differences created a need for a system of language identification. The W3C responded by incorporating into HTML 4.0 the language tags put forth in RFC 2070, the standard on internationalization.
The “LANG” Attribute
The lang attribute can be added within any tag to specify the language of that element's content. It can also be added within the <html> tag to specify a language for an entire document. The following example specifies the document’s language as French:
<HTML LANG="fr">
It can also be used within text elements to switch to another language within a document. For example, you can “turn on” Norwegian for just one element:
<BLOCKQUOTE lang="no">...</BLOCKQUOTE>
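The two uses above can be combined in one document. The following is a minimal sketch rather than a definitive template: the title and sample sentences are placeholders invented for illustration, and it assumes the HTML 4.0 behavior in which nested elements inherit the language of their parent unless they set their own lang value.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<!-- Document-wide language: French -->
<HTML LANG="fr">
<HEAD>
<TITLE>Exemple</TITLE>
</HEAD>
<BODY>
<!-- No lang attribute here, so this paragraph is treated as French -->
<P>Ce paragraphe suit la langue du document.</P>

<!-- Switch to Norwegian for this element and its contents only -->
<BLOCKQUOTE LANG="no">
<P>Dette sitatet er norsk.</P>
</BLOCKQUOTE>

<!-- Outside the blockquote, the document language (French) applies again -->
<P>Le document continue ici.</P>
</BODY>
</HTML>
```

Setting the language once on the root element and overriding it only where needed keeps the markup short and avoids repeating the attribute on every element.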
The value for the lang attribute is a two-letter language code (not the same as country codes). Table 27-1 lists the currently available language codes.
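Language codes and country codes come from different lists, so the two can differ for the same place. As a hedged illustration (the Japanese example is an assumption drawn from the ISO 639 and ISO 3166 lists rather than from this chapter), Japanese is identified by the language code ja, while jp is the country code for Japan:

```html
<!-- "ja" is the language code for Japanese -->
<P LANG="ja">...</P>

<!-- "jp" is the country code for Japan, not a language code; avoid it here -->
<P LANG="jp">...</P>
```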
Table 27-1. Code for the Representation of Names of Languages
| Code | Language | Code | Language | Code | Language |
|---|---|---|---|---|---|
| aa | Afar | ia | Interlingua | rn | Kirundi |
| ab | Abkhazian | id | Indonesian (formerly in) | ro | Romanian |
| af | Afrikaans | ie | Interlingue | ru | Russian |
| am | Amharic | ik | Inupiak | | |