Titlecase Mapping (tc), 226
toLowerCase Java function, 584
tone marks, 365
Tools menu (MS Word), 96
toTitleCase Java function, 584
toUpperCase Java function, 584
trademark (™), 425
traditional writing system (Chinese), 380
transcoding, 363
    tools, 146
transcriptions, 361–368
transfer encoding negotiation, 545
Transfer Encoding Syntax (TES), 139, 140
    heading for, 514
Transfer-Encoding header, 514
translations, producing for the web, 542
transliterations, 361–368
troff subtype, 500
TrueType, 34
truncation, 277
TSV (Tab Separated Values), 439, 498
Turkish encodings, 152
Turkish language, 357
two-letter code for General Category values, 217
Type 1, 34
type faces, 33
type text, 497
type-map method (Apache), 539
typeface, 33
typographic discrepancies, 74
typography, 7, 384
U
U+nnnn convention, 19
U.S. International keyboards, 80
UAX (Unicode Standard Annex), 264
    line break rules, 279
uc (Uppercase Mapping), 226
UCA (Unicode Collation Algorithm), 261–264
UCD (Unicode Character Database), 219
UCS (Universal Character Set), 164, 173, 301
UCS Sequence Identifiers (USI), 173
UCS-2, 304, 312
UCS-4, 303, 312
Ugar (Ugaritic) script code, 349
UIdeo (Unified Ideograph), 227
unambiguity, 157–159
unassigned code points, 183, 185, 292
underlined text, 475
underscore (_), 395
underscore (_) in Unicode database files, 229
Unibook, 193
Unicode 1 Name (na1), 224
Unicode 2.0 repertoire, 39
Unicode 4.0.1, 189
Unicode 4.1.0, 189
Unicode algorithms, 296
Unicode Case Charts, 254
Unicode Collation Algorithm (UCA), 261–264
Unicode Consortium, 164
    collation charts and, 257
Unicode Encoded logo, 531
Unicode fonts, 202
Unicode Radical Stroke Count (URS), 227
Unicode scalar values, 302
Unicode Sequence Identifier (USI), 418
Unicode Standard Annex (UAX), 264
    line break rules, 279
Unicode standard annexes, 298
Unicode Technical Note (UTN), 313
Unicode Technical Report (UTR), 139
    normalization vs. folding, 245
Unicode Transformation Format (UTF), 301
Unicode versions, 174, 189
UnicodeBlock.of Java function, 584
UnicodeData.txt file, 196, 220
    canonical mapping and, 236
    database files and, 229
unification, 160, 208, 370
unified diacritics, 162
Unified Ideograph (UIdeo), 227
Uniform Resource Locators (see URL)
uniformity, 159
Unihan.txt file, 225
UniPad, 115
Uniscribe (Microsoft), 44
unit symbols, 449
units of text (characters), 8
Universal Character Set (UCS), 164, 173, 301
universality, 157
Uniview, 216
update version number, 174
Upper property, 227
uppercase letters, 251
    case folding and, 254
    vs. lowercase, 387