
• Hyphen-minus “-” (U+002D)—i.e., the common hyphen (as in ASCII)
This simple repertoire usually makes it straightforward to construct identifiers
that correspond to character names, for use in computer programs, database entries,
etc. Identifier syntax usually disallows spaces, but you can replace each space with a
low line (underscore) “_” character without ambiguity, e.g., using COMMERCIAL_AT. The
hyphen-minus character can be more problematic if identifier syntax disallows it.
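As a minimal sketch of this kind of mapping, the following Python fragment derives an identifier from a character name. The function name name_to_identifier and the choice of a double underscore for hyphen-minus are assumptions made for illustration only; they are not a convention defined by the standard.

```python
import unicodedata

def name_to_identifier(name: str) -> str:
    """Turn a Unicode character name into an identifier-like string.

    Assumption: the target syntax allows only letters, digits, and "_".
    A space becomes "_" and a hyphen-minus becomes "__" (arbitrary choice).
    """
    # Replace hyphens first so that the underscores produced for spaces
    # are not touched by a later step.
    return name.replace("-", "__").replace(" ", "_")

print(name_to_identifier(unicodedata.name("@")))  # COMMERCIAL_AT
print(name_to_identifier(unicodedata.name("-")))  # HYPHEN__MINUS
```

The space replacement is reversible on its own. The hyphen replacement shown here is only one option, and it may not be reversible for the few names in which a hyphen-minus is adjacent to a space.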
Digits have been avoided in Unicode names; even the digits themselves have names like
“digit zero.” Some names, however, contain digits, because they have been generated
algorithmically, by enumeration (e.g., “Greek vocal notation symbol-1”), or using the
code number as part of the name (e.g., “CJK unified ideograph-4E00”). Such names are
not very practical, and they have been included just to give every character a formal
name. Braille pattern character names contain digits that indicate the positions of dots
(e.g., “Braille pattern dots-1245”). A few names contain digits because they refer to
the shapes of digits (e.g., “double low-9 quotation mark”).
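The digit-bearing names mentioned above can be inspected with Python's standard unicodedata module. The helpers cjk_name and braille_dots below are hypothetical, written only to illustrate how such names can be derived from the code number; cjk_name assumes its argument lies in a CJK unified ideograph range, and braille_dots assumes a code point in the Braille patterns block (U+2800 to U+28FF).

```python
import unicodedata

def cjk_name(code_point: int) -> str:
    # Assumption: code_point is a CJK unified ideograph; the name is
    # the fixed prefix followed by the code number in hexadecimal.
    return f"CJK UNIFIED IDEOGRAPH-{code_point:04X}"

def braille_dots(code_point: int) -> str:
    # In the Braille patterns block, bit i of the offset from U+2800
    # corresponds to dot number i + 1.
    bits = code_point - 0x2800
    return "".join(str(i + 1) for i in range(8) if bits & (1 << i))

for ch in ("0", "\u4E00", "\u281B", "\u201E"):
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

print(cjk_name(0x4E00))      # CJK UNIFIED IDEOGRAPH-4E00
print(braille_dots(0x281B))  # 1245
```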
Case of letters in names
Technically, the standard defines the letters used in Unicode names as uppercase. No
ambiguity can arise, however, from using lowercase. The variation should be considered
as typographical only, since the case of letters is not significant in ...