
odds are that some strictly font-based approach is used. When Unicode or some other
standardized encoding is used, you are not limited to any particular font; any font
that contains the characters will do.
Conceptually, the approach discussed here means that you implicitly define a character
by the design of a font. If you put the letter alpha (α) into the code position that is
occupied by the letter “a” in ASCII—i.e., 61 (hexadecimal)—you are in the process of
defining a character code where that position is allocated for the alpha. However, you
rely on the use of a special font, which logically corresponds to a character code
conversion.
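The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table FONT_SLOT_TO_UNICODE and the function decode_font_based are hypothetical names, and the one-entry table stands in for whatever a real special font would draw in each code position. Positions that the table does not mention are simply read as in ISO 8859-1, which is also an assumption.

```python
# Hypothetical font-based "encoding": the special font draws the glyph for
# alpha in the code position that ASCII assigns to the letter "a" (61 hex).
FONT_SLOT_TO_UNICODE = {0x61: "\u03B1"}  # U+03B1 GREEK SMALL LETTER ALPHA


def decode_font_based(data: bytes) -> str:
    """Convert font-encoded bytes to Unicode characters.

    Each byte is mapped to the character that its glyph in the special font
    represents; unmapped bytes are read as ISO 8859-1 (an assumption made
    only to keep the sketch complete).
    """
    return "".join(FONT_SLOT_TO_UNICODE.get(b, chr(b)) for b in data)


stored = bytes([0x61])               # the same stored data in both cases
as_ascii = stored.decode("ascii")    # read as ASCII: the letter "a"
as_font = decode_font_based(stored)  # read through the font's implicit code: alpha
```

The stored byte is identical in both cases. Only the interpretation changes, and that interpretation is exactly the character code conversion that the special font performs implicitly.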
Unicode and UTF-8
Since the range of code numbers in Unicode is very large, it is useful to have different
encodings for different purposes. Some encodings are technically very simple and efficient
in terms of internal data processing but wasteful in storage space. Other
encodings aim at compactness, for efficiency in data storage and transfer. Before discussing
the encodings, we will consider a general conceptual model, which is aimed at
clarifying the different meanings and levels of encoding character data.
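As a rough illustration of the trade-off between simplicity and compactness, the following Python sketch encodes one short string in three Unicode encodings and counts the resulting octets. The sample string and the variable names are arbitrary choices made for this example, and the big-endian variants are used only so that no byte order mark is added to the counts.

```python
# Sample text: seven ASCII characters followed by one Greek letter.
text = "alpha: \u03B1"

# Encode the same eight characters in three Unicode encodings and
# record how many octets each encoding needs.
sizes = {
    name: len(text.encode(name))
    for name in ("utf-8", "utf-16-be", "utf-32-be")
}

for name, octets in sizes.items():
    print(f"{name}: {octets} octets for {len(text)} characters")
```

For this mostly ASCII text, the fixed-width encoding with four octets per character takes the most space, and UTF-8 takes the least. The ratio depends on the text: the more characters outside the ASCII range, the smaller the advantage of UTF-8.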
This discussion deals with Unicode encodings in general terms and in reference to
options that you have, as a user, in choosing an encoding. The technical definitions of
the encodings (i.e., how data is encoded in detail) are in Chapter 6. ...