Unicode Demystified by Richard Gillam

General Character Properties

Each character has a set of properties that serve to identify the character. These include the name, Unicode 1.0 name, Jamo short name, ISO 10646 comment, block, and script.

Standard Character Names

First among these properties, of course, is the character's name, which is given both in the book and in the UnicodeData.txt file. The name is always in English, and the only legal characters for the name are the 26 Latin capital letters, the 10 Western digits, the space, and the hyphen. The name is important, as it's the primary guide to just what character is meant by the code point. The names generally follow some conventions:

  • For those characters that belong to a particular script (writing system), the script name is included ...
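As a rough illustration of the naming rules described above, the following Python sketch looks up a few standard character names and checks them against the restricted character repertoire. It relies on Python's standard `unicodedata` module, which exposes names from the Unicode Character Database. The helper names `character_name` and `uses_legal_name_characters` are hypothetical, chosen only for this example, and the pattern assumes the space counts as a legal name character, since names such as LATIN CAPITAL LETTER A contain it.

```python
import re
import unicodedata

# Assumed repertoire for names: capital letters A-Z, digits 0-9,
# the space, and the hyphen.
NAME_CHARS = re.compile(r"[A-Z0-9 -]+")


def character_name(ch: str) -> str:
    """Return the standard name of a single character, or "" if it has none."""
    return unicodedata.name(ch, "")


def uses_legal_name_characters(name: str) -> bool:
    """Check that a name is non-empty and uses only the assumed repertoire."""
    return NAME_CHARS.fullmatch(name) is not None


if __name__ == "__main__":
    # A Latin letter, an accented Latin letter, a Hebrew letter, and a hyphen.
    for ch in ("A", "\u00e9", "\u05d0", "-"):
        name = character_name(ch)
        legal = uses_legal_name_characters(name)
        print(f"U+{ord(ch):04X}  {name}  legal={legal}")
```

For the letters in this sample, the script name (LATIN, HEBREW) appears at the start of the name, in line with the first convention listed above. The reverse lookup `unicodedata.lookup("HEBREW LETTER ALEF")` returns the character for a given name.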