Unicode Explained

the software you use can probably handle the mappings internally, so that the font can

be used for Unicode text as well.

Whether you can vary the font in your text depends on the tools and data formats you

use. In plain text, there is no font variation, but word processors work with other

formats. They usually have a simple tool for this: for example, you select some words

and set their font to something different from the surrounding text.

However, some special tricks have often been used in an attempt to extend the character

repertoire through font settings. In Chapter 3, we noted that you could type, on your word

processor, the letters “abc” and then select them and use the font-changing command

to set the font to Symbol to get “αβχ” (i.e., three Greek lowercase letters). We analyzed

this from the viewpoint of character encoding, but here the emphasis is on comparing

such tricks with the Unicode approach.

Logically, the Symbol font is a collection of mostly wrong glyphs for characters (e.g.,

an α glyph for “a”). Of course, the same trick works for Unicode text, too, unless the

software you use refuses to perform the illogical move. After all, the Symbol font does

not contain the letters “abc,” so any request to use it for them should be ignored.

Anyway, using Unicode, such tricks are completely unnecessary and pointlessly risky.

A change of font never changes the identity of characters, in the logical sense, so even

if you see “αβχ,” it’s still “abc.” This can be checked by changing the font to something

else. There’s no reason to take the slightest risk of having your data passed through

some process that changes the font and distorts what you meant. In Unicode, you

simply use the right characters, using some suitable input method. To help you in such

a conversion, Appendix A contains a table of Unicode equivalents of Symbol font

glyphs.
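
The distinction between changing a font and changing characters can be sketched in code. The following Python fragment is only an illustration, not the Appendix A table: the names SYMBOL_TO_UNICODE and symbol_to_unicode are hypothetical, and the mapping covers just the three letters used in the example above. It shows that "abc" keeps its Latin code points however it is displayed, and that a conversion has to replace the characters themselves.

```python
import unicodedata

# Hypothetical, partial mapping: the character typed under the Symbol font
# mapped to the Unicode character whose glyph that font shows for it.
# Only the three letters from the example are covered.
SYMBOL_TO_UNICODE = {
    "a": "\u03B1",  # GREEK SMALL LETTER ALPHA
    "b": "\u03B2",  # GREEK SMALL LETTER BETA
    "c": "\u03C7",  # GREEK SMALL LETTER CHI
}


def symbol_to_unicode(text: str) -> str:
    """Replace each mapped character; leave unmapped characters as they are."""
    return "".join(SYMBOL_TO_UNICODE.get(ch, ch) for ch in text)


typed = "abc"                         # what the data contains, whatever the font
converted = symbol_to_unicode(typed)  # the characters that were actually meant

# Print code point and character name for the original and converted text.
for ch in typed + converted:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

After such a conversion the text no longer depends on the Symbol font: any font that contains glyphs for the Greek letters can display it.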

This should not be confused with font changes needed to make some correctly entered

characters visible. For example, if you use any of the methods described in Chapter 2

to enter the Greek letter alpha α, it might still fail to display properly. If the current font

does not contain a glyph for alpha, you need to change the font locally (or globally) to

something else, such as Arial Unicode MS—but any font containing the alpha will do.

Criticism of Unicode

Unicode has been criticized on several counts, from very different perspectives. The

following discussion tries to summarize most of the arguments and comment on them.

The presentation is not apologetic; it will admit that there are good points in the criticism.

Criticism of the lack of tools for indicating semantic structures is not discussed here. It is

indirectly addressed in the section “Why Not Markup in Unicode?” in Chapter 9.


Unicode Explained by Jukka K. Korpela