
Appendix D: Glossary
character that is not considered standard on
most operating systems or environments. See
also system-specific character.
USLP
Unix System Laboratories Pacific.
USV
See Unicode Scalar Value.
UTC
Unicode Technical Committee.
UTF
Unicode (or UCS) Transformation Format. A
series of encoding forms for Unicode and ISO
10646. See also UTF-8, UTF-16, and UTF-32.
UTF-2
Obsolete. A UTF defined by AT&T Bell Labs
(Plan 9) and X/Open for encoding Unicode
text as a stream of bytes. Also called FSS-UTF
(File System Safe UTF), and now referred to as
UTF-8. See also Plan 9 and UTF-8.
UTF-7
A variation of Base64 encoding that transforms
Unicode encodings (UCS-2, UCS-4, and UTF-16)
into a form that can be safely
transmitted through 7-bit pathways. Most
ASCII characters represent themselves under
this encoding.
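
As a rough illustration of the UTF-7 entry above, the following sketch uses Python's built-in "utf-7" codec (an assumption about tooling; the glossary itself names no implementation). The sample strings are arbitrary. It shows that plain ASCII letters pass through unchanged, while non-ASCII text is carried in a Base64-style run that stays within the 7-bit range.

```python
# Minimal sketch using Python's standard "utf-7" codec.
# The sample strings are illustrative assumptions, not from the glossary.

ascii_text = "Hello"
mixed_text = "Hi 日本語"

ascii_encoded = ascii_text.encode("utf-7")
mixed_encoded = mixed_text.encode("utf-7")

# ASCII letters represent themselves under UTF-7.
print(ascii_encoded)

# Non-ASCII characters are wrapped in a "+...-" modified Base64 run.
print(mixed_encoded)

# Every byte of the encoded form fits in 7 bits.
is_seven_bit = all(byte < 0x80 for byte in mixed_encoded)
print("7-bit safe:", is_seven_bit)

# Decoding restores the original text.
round_trip = mixed_encoded.decode("utf-7")
print(round_trip == mixed_text)
```

Because the plus sign introduces the Base64 run, a literal "+" in the input cannot represent itself and is written as "+-" by this codec.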
UTF-8
The 8-bit encoding form (and encoding
scheme) of Unicode and ISO 10646 that uses
up to four code units to represent a character.
Its original definition, which accommodated
UCS-4 encoding, used a variable-length one-
through six-byte encoding. Once called UTF-2
and FSS-UTF (File System Safe UTF).
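
The "up to four code units" in the UTF-8 entry above can be checked with a short sketch. It uses Python's built-in "utf-8" codec; the four sample characters are assumptions chosen so that each falls in a different length class.

```python
# Minimal sketch: how many 8-bit code units UTF-8 uses per character.
# The sample characters are illustrative assumptions, one per length class.

samples = [
    "A",           # U+0041, ASCII range
    "é",           # U+00E9, Latin-1 Supplement
    "日",          # U+65E5, a BMP ideograph
    "\U00020BB7",  # U+20BB7, a non-BMP ideograph
]

lengths = []
for ch in samples:
    encoded = ch.encode("utf-8")
    lengths.append(len(encoded))
    # Show the code point, the code unit count, and the bytes in hex.
    print(f"U+{ord(ch):04X}: {len(encoded)} code unit(s): {encoded.hex(' ')}")

print(lengths)  # [1, 2, 3, 4]
```

No valid Unicode scalar value needs more than four code units; the five- and six-byte forms existed only in the original definition that accommodated UCS-4.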
UTF-16
The 16-bit encoding form of Unicode and
ISO 10646. Uses 16-bit code units. Non-BMP
characters are encoded through the use of
High and Low Surrogates. Requires the BOM
to specify byte order. See also BOM.
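
To make the surrogate mechanism in the UTF-16 entry above concrete, here is a minimal sketch of the standard arithmetic for splitting a non-BMP code point into a High and a Low Surrogate. The function name to_surrogates and the sample code point U+20BB7 are assumptions made for illustration. The last lines rely on Python's own codecs: "utf-16" prepends a BOM in the platform's native byte order, while "utf-16-be" fixes the byte order and writes no BOM.

```python
import codecs


def to_surrogates(code_point: int) -> tuple[int, int]:
    """Split a non-BMP code point into (high, low) surrogate code units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only non-BMP code points need surrogates")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


# Hypothetical sample: U+20BB7, an ideograph outside the BMP.
high, low = to_surrogates(0x20BB7)
print(f"{high:04X} {low:04X}")  # D842 DFB7

# Cross-check against Python's big-endian UTF-16 codec (no BOM written).
big_endian = "\U00020BB7".encode("utf-16-be")
print(big_endian.hex(" "))

# The plain "utf-16" codec prepends a BOM in native byte order.
with_bom = "\U00020BB7".encode("utf-16")
print(with_bom[:2] == codecs.BOM_UTF16)
```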
UTF-16BE
A Unicode encoding scheme that represents
the big-endian ...