
Unicode Demystified by Richard Gillam


General Category

After the code point value and the name, the next most important property that a Unicode character has is its general category. Seven primary categories exist: letter, number, punctuation, symbol, mark, separator, and miscellaneous. Each is subdivided into additional categories.
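As a minimal sketch of how these seven primary categories can be inspected in practice, the snippet below uses Python's standard `unicodedata` module. The module reports a two-letter code whose first letter identifies the primary category and whose second letter identifies the subdivision. The `PRIMARY` table and the `primary_category` helper are illustrative names, not part of any library, and the English labels follow the wording used in the text above.

```python
import unicodedata

# Illustrative mapping from the first letter of the two-letter
# general category code to the primary category names used in the text.
PRIMARY = {
    "L": "letter",
    "N": "number",
    "P": "punctuation",
    "S": "symbol",
    "M": "mark",
    "Z": "separator",
    "C": "miscellaneous",
}

def primary_category(ch: str) -> str:
    """Return the primary general category name for a single character."""
    code = unicodedata.category(ch)  # e.g. "Lu" for an uppercase letter
    return PRIMARY[code[0]]

if __name__ == "__main__":
    for ch in ["A", "5", ",", "+", "\u0301", " ", "\n"]:
        print(repr(ch), unicodedata.category(ch), primary_category(ch))
```

The second letter of each code (for example the `u` in `Lu`) is the subdivision within the primary category.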

Letters

The Unicode standard uses the term “letter” rather loosely in assigning things to this general category. Whatever counts as the basic unit of meaning in a particular writing system, whether it represents a phoneme, a syllable, or a whole word or idea, is assigned to the “letter” category. The major exception to this rule comprises marks that combine typographically with other characters, which are categorized as “marks” instead of “letters.” They ...
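The distinction drawn above between letters and combining marks can be checked with a short sketch, again assuming Python's standard `unicodedata` module. The helper name `is_letter` and the sample characters are chosen here for illustration only: a Latin letter (a phoneme), a hiragana character (a syllable), a Han ideograph (a word or idea), and a combining acute accent (a mark).

```python
import unicodedata

def is_letter(ch: str) -> bool:
    """True if the character's general category code starts with 'L'."""
    return unicodedata.category(ch).startswith("L")

# Sample characters chosen for illustration.
samples = {
    "a": "Latin small letter (phoneme)",
    "\u304b": "Hiragana KA (syllable)",
    "\u4e2d": "Han ideograph (word or idea)",
    "\u0301": "combining acute accent (mark)",
}

if __name__ == "__main__":
    for ch, description in samples.items():
        print(f"U+{ord(ch):04X}", unicodedata.category(ch),
              is_letter(ch), description)
```

The first three samples are reported as letters, while the combining accent falls in the mark category even though it only ever appears attached to a letter.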
