
CHAPTER 4
The Structure of Unicode
This chapter is an in-depth presentation of the fundamentals of Unicode, including
design principles, coding space, and special terminology. It describes Unicode’s nature
as an umbrella standard based on a large number of older standards, as well as its
relationship to ISO 10646, and examines both the unification principle and criticism of it.
However, to divide the complexity into manageable pieces, we postpone the discussion
of properties of characters (including, for example, normalization) and the Unicode
encodings to the next two chapters.
Design Principles
We start from the proclaimed design principles of Unicode; critical notes and
considerations follow later. We will first consider the very general, slogan-like
expressions of the goals, and then the more technical principles.
Goals: Universality, Efficiency, Unambiguity
The Unicode standard itself says that it was designed to be universal, efficient, and
unambiguous. These slogans have real meaning here, but it is important to analyze
both what they mean and what they do not mean. Let us first see how they are presented
in the Unicode standard, and then analyze each item:
The Unicode Standard was designed to be:
Universal. The repertoire must be large enough to encompass all characters that are likely
to be used in general text interchange, including those in major international, national, ...