Definitions in character standards assign a number to each
character. The numbers are unique in each standard, but different
standards assign the numbers differently. Some commonly used standards
are mutually compatible, in part: the numbers of characters in ASCII
(ranging from 0 to 127) are the same as in the ISO 8859 standards, and
the numbers of characters in ISO 8859-1 (ranging from 0 to 255) are
the same as in Unicode.
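
The partial compatibility described above can be checked with a short Python sketch. It relies on Python's built-in codecs named "ascii", "iso-8859-1", and "iso-8859-15"; the choice of ISO 8859-15 as the second ISO 8859 member is only an illustration, and the variable names are hypothetical.

```python
# Decode every ASCII code number (0..127) under three different standards.
ascii_bytes = bytes(range(128))
as_ascii = ascii_bytes.decode("ascii")
as_latin1 = ascii_bytes.decode("iso-8859-1")
as_latin9 = ascii_bytes.decode("iso-8859-15")
ascii_range_agrees = as_ascii == as_latin1 == as_latin9

# Decode every ISO 8859-1 code number (0..255) and compare each resulting
# character's Unicode number with the original number.
latin1_text = bytes(range(256)).decode("iso-8859-1")
latin1_matches_unicode = all(ord(ch) == n for n, ch in enumerate(latin1_text))

# The compatibility is only partial: number 0xA4 means different characters
# in ISO 8859-1 and ISO 8859-15.
a4_in_latin1 = b"\xa4".decode("iso-8859-1")
a4_in_latin9 = b"\xa4".decode("iso-8859-15")

print(ascii_range_agrees, latin1_matches_unicode)
print(hex(ord(a4_in_latin1)), hex(ord(a4_in_latin9)))
```

Above 127 the ISO 8859 standards diverge from each other, which is why only the ASCII range can be relied on across all of them.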
The numbers are nonnegative integers 0, 1, 2,…, but are
not necessarily consecutive; there can be gaps in the assignment. For
example, in ISO 8859 standards, numbers in the range 128 to 159 are
unassigned; more specifically, they are reserved for control purposes,
leaving it up to other standards to define them. Unicode contains
many gaps, partly because of its coding structure and partly to leave
space for future extensions.
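
As a rough illustration of such gaps, the sketch below uses Python's standard unicodedata module, which reports the general category "Cn" for code numbers with no character assigned. The helper name unassigned_in is hypothetical, and the exact result depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

def unassigned_in(start, stop):
    """Return the numbers in range(start, stop) whose general category is 'Cn'."""
    return [cp for cp in range(start, stop)
            if unicodedata.category(chr(cp)) == "Cn"]

# Numbers 128..159: Unicode classifies all of them as control characters ('Cc').
c1_categories = {unicodedata.category(chr(cp)) for cp in range(128, 160)}

# Gaps inside one small region, the Greek and Coptic block (U+0370..U+03FF).
greek_gaps = unassigned_in(0x0370, 0x0400)

print(c1_categories)
print([f"U+{cp:04X}" for cp in greek_gaps])
```

The Greek block is used here only because it is small enough to print; any other range can be passed to the helper.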
It might sound natural to use the first few code numbers for
digits 0, 1,…, but character standards use different
assignments. Don’t expect to find much logic in them. The code
number of a character should be treated as fairly arbitrary, but
fixed.
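
A small Python sketch makes the point concrete. It compares the numbers of a few characters in Unicode with their numbers in EBCDIC code page 037, which is an assumption chosen purely for illustration (Python ships it as the "cp037" codec); the function name code_numbers is hypothetical.

```python
def code_numbers(ch):
    """Return (Unicode number, EBCDIC cp037 number) for a single character."""
    return ord(ch), ch.encode("cp037")[0]

# Digits do not start at 0 in either standard, and the two standards disagree.
for ch in "09AZaz":
    unicode_number, ebcdic_number = code_numbers(ch)
    print(ch, unicode_number, ebcdic_number)
```

In Unicode (and ASCII) the digit 0 has number 48, while in this EBCDIC code page it has number 240; neither is "natural", but each is fixed within its standard.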
The number assigned to a character in a character standard has
many different names: code number, code position, code value,
code element, code point, code set value, as well as simply code.
In the Unicode standard, the term “code point” is used both
for a number and for a location in the coding space where a
character could reside. Some code points are allocated for characters,
a few have been explicitly designated as not corresponding to
characters (now or ever), and most code points are still not assigned
in any way.
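
The different kinds of Unicode code points can be tallied with a sketch like the following. It is only a rough classification built on Python's unicodedata module: the function name classify and its labels are hypothetical, surrogate and private-use code points are counted separately from ordinary assigned characters, and the totals depend on the Unicode version bundled with the interpreter.

```python
import unicodedata
from collections import Counter

def classify(cp):
    """Roughly classify a Unicode code point (an integer in 0..0x10FFFF)."""
    # Noncharacters: U+FDD0..U+FDEF and the last two code points of each plane.
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        return "noncharacter"
    category = unicodedata.category(chr(cp))
    if category == "Cn":
        return "unassigned"
    if category == "Cs":
        return "surrogate"
    if category == "Co":
        return "private use"
    return "assigned"

# Tally every code point in the coding space.
counts = Counter(classify(cp) for cp in range(0x110000))
for kind, count in counts.most_common():
    print(f"{kind:13} {count}")
```

The noncharacter count is fixed at 66, whereas the split between assigned and unassigned code points shifts with each new version of the standard.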
Since characters are internally represented by their code
numbers, a character can also be treated as an integer. In fact, many
old programming languages lack a data type for characters and use an
integer type instead. However, the code numbers are usually not used
in arithmetic operations, since they mostly lack numeric meaning. If a
character’s number is smaller than another character’s
number, this by no means implies a corresponding relation in
alphabetic order. For some small regions of code numbers, such as the
letters A to Z in ASCII, the order does correspond to alphabetic order.
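
The following Python sketch illustrates both points: characters convert freely to and from integers, but comparing them by number does not give alphabetic order. The word list is made up for the example.

```python
import string

# A character can be turned into its code number and back again.
next_after_a = chr(ord("a") + 1)

# Within the small region a..z the numbers are consecutive and alphabetic.
a_to_z = "".join(chr(n) for n in range(ord("a"), ord("z") + 1))

# Plain sorting compares code numbers: uppercase letters come before all
# lowercase ones, and a letter with a diaeresis comes after "z".
words = ["zebra", "Apple", "\u00e4iti", "apple", "Zulu"]
by_code_number = sorted(words)

print(next_after_a)                      # b
print(a_to_z == string.ascii_lowercase)  # True
print(by_code_number)
```

Sorting text the way readers of a given language expect requires collation rules that go beyond comparing code numbers.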