
almost always provide a facility for indicating the byte order. Table 1-22 lists some
encoding methods that make use of wide characters, all of which are encoding methods for
Unicode.
Table 1-22. Wide characters: encoding methods
Encoding    Encoding length
UCS-2       16-bit fixed
UTF-16      16-bit variable-length
UTF-32      32-bit fixed
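The difference between fixed and variable length can be checked with a minimal Python sketch. The helper name `encoded_length` and the two sample characters are illustrative assumptions, not taken from the text. Python ships no UCS-2 codec, so only UTF-16 and UTF-32 are measured; UCS-2 behaves like UTF-16 for the first character and cannot represent the second at all.

```python
def encoded_length(text: str, codec: str) -> int:
    """Return the number of bytes `text` occupies when encoded with `codec`."""
    return len(text.encode(codec))

# Illustrative sample characters (assumptions, not from the text).
bmp_char = "\u6f22"            # U+6F22, inside the Basic Multilingual Plane
supplementary = "\U00020BB7"   # U+20BB7, outside the BMP

# The "-be" codecs emit no BOM, so only the character itself is counted.
utf16_lengths = (
    encoded_length(bmp_char, "utf-16-be"),
    encoded_length(supplementary, "utf-16-be"),
)
utf32_lengths = (
    encoded_length(bmp_char, "utf-32-be"),
    encoded_length(supplementary, "utf-32-be"),
)

print("UTF-16:", utf16_lengths)
print("UTF-32:", utf32_lengths)
```

UTF-16 needs one 16-bit unit for the first character and two (a surrogate pair) for the second, whereas UTF-32 uses one 32-bit unit for both.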
It is sometimes recommended that the encodings listed in Table 1-22 use the Byte Order
Mark (BOM) at the beginning of a file to explicitly indicate the byte order. The BOM is
covered in greater detail in Chapter 4.
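As a rough illustration of how a BOM at the start of a file can indicate the byte order, the following Python sketch inspects the leading bytes of a buffer. The function name `sniff_bom` and its return convention are hypothetical; the BOM constants come from Python's standard `codecs` module. This is only a sketch under those assumptions, not a complete encoding detector.

```python
import codecs

# Longer BOMs are checked first, because the UTF-32LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

def sniff_bom(data: bytes):
    """Return (codec_name, bom_length), or (None, 0) if no BOM is present.

    Hypothetical helper: it looks only at the first bytes of `data`.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0

sample = "\ufeffA".encode("utf-16-be")   # BOM followed by "A", big-endian
codec_name, skip = sniff_bom(sample)
if codec_name is not None:
    print(codec_name, sample[skip:].decode(codec_name))
```

One caveat of this ordering: UTF-16LE data whose first character after the BOM is U+0000 would be misread as UTF-32LE, so a real application may need additional checks.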
It is with endianness or byte order that we can more easily distinguish multiple-byte from
wide characters. Multiple-byte characters have the same byte order, regardless of the
underlying processor architecture. The byte order of wide characters is determined by the
underlying processor architecture and must be flagged or indicated in the data itself.
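The distinction can be made concrete with a small Python sketch. Taking UTF-8 as the multiple-byte example and a single 16-bit unit as the wide character is an assumption made here for illustration, as is the sample character. The `struct` format characters `>` and `<` stand in for big-endian and little-endian processors.

```python
import struct

code_point = 0x6F22  # illustrative sample character (an assumption)

# Multiple-byte form (UTF-8 assumed here): a fixed sequence of bytes that
# is the same on every processor architecture.
multibyte = chr(code_point).encode("utf-8")

# Wide-character form: one 16-bit integer, whose in-memory byte sequence
# depends on the byte order used to store it.
wide_big_endian = struct.pack(">H", code_point)
wide_little_endian = struct.pack("<H", code_point)

print(multibyte.hex(" "))           # e6 bc a2
print(wide_big_endian.hex(" "))     # 6f 22
print(wide_little_endian.hex(" "))  # 22 6f
```

The two wide-character byte sequences represent the same character, which is why the data must carry some indication of which order was used.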
Advice to Readers
This chapter serves as an introduction to the rest of this book, and is meant to whet your
appetite for what lies ahead in the pages that follow. When reading the chapters of this
book, I suggest that you focus on the sections that cover or relate to Unicode, because they
are likely to be of immediate value and benefit. Information about legacy character sets
and encodings is still of great value because it relates to Unicode, often directly, and also
serves to chronicle how we got ...