
Making Use of Character Numbers
There are several ways to use the Unicode number of a character. The methods of
writing characters will be discussed in Chapter 2, but here are some possibilities:
• In HTML and XML authoring, you can use a character reference of the form
&#xnumber; where number is the character’s number in hexadecimal (e.g., &#x212E; for ℮).
That way, you can include any character, no matter what your keyboard is or what
your document’s encoding is.
• On Microsoft software that uses the so-called Uniscribe input (e.g., many programs
under Windows XP), you can type a character’s number in hexadecimal, such as
212e, and then type Alt-X and see how the number is replaced by the character.
• You can use the number as an index to information on characters in different tables,
databases, and services, including the Unicode standard.
• You can select a character by its number in user interfaces such as the Character
Map in Windows, as illustrated earlier in Figure 1-1, or the window that opens in
Microsoft Word when you select Insert → Symbol. The latter is illustrated in
Figure 1-9, which shows the window in a Finnish version of Word. As you can see,
the character name shown is still the Unicode name as such: in this case,
ESTIMATED SYMBOL.
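As a rough illustration of the first and third possibilities above, the following Python sketch starts from the number 212E (the ESTIMATED SYMBOL example used in this section), builds the corresponding hexadecimal character reference, and uses the number to look up the character's Unicode name. The variable names are made up for this example, and the name lookup relies on the Unicode data bundled with the Python standard library rather than on the Unicode standard's own files.

```python
import html
import unicodedata

# The Unicode number of ESTIMATED SYMBOL, written in hexadecimal.
code_point = 0x212E

# Turn the number into the character itself.
char = chr(code_point)

# Build an HTML/XML character reference of the form &#xnumber;
reference = f"&#x{code_point:X};"

# Decoding the reference gives back the same character,
# regardless of the keyboard or the document's encoding.
decoded = html.unescape(reference)

# Use the character as an index into the Unicode character database.
name = unicodedata.name(char)

print(reference, decoded, name)
```

The same pattern works for any other assigned character; only the value of code_point changes.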
Encoding Characters as Octet Sequences
When we need to store character data on a computer, we might consider storing it in
an exact visual shape. Some people would call this a very naive idea, but it is ...