3.5. Numeric properties

Some characters may be used as digits, a trivial example being '3' 0x0033 DIGIT THREE, which is part of the curriculum about halfway through nursery school. For a young reader of Tamil, this digit is written '௩' 0x0BE9 TAMIL DIGIT THREE, but the semantics are the same. The fact that we all have ten fingers must certainly have favored base-ten arithmetic, without regard for language, religion, or skin color.

It is interesting to know that ௩ is the number three, even if we are not readers of Tamil. For that reason, Unicode set aside three fields in UnicodeData.txt: value of decimal digit (field 6), value of digit or value of numeral (field 7), and numeric value or value of alphanumeric numeral (field 8). Once again we are baffled by the subtleties of the jargon being used: what exactly distinguishes these three fields?
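To make the three fields concrete, here is a minimal Python sketch that splits a few records in the UnicodeData.txt layout and pulls out fields 6, 7, and 8, counting fields from 0. The helper names numeric_fields and parse_records are hypothetical, and the four sample records are transcribed by hand for illustration; they should be checked against the UnicodeData.txt of whichever Unicode version is in use.

```python
from fractions import Fraction

# Sample records in the UnicodeData.txt layout: semicolon-separated fields,
# counted from 0. Transcribed by hand for illustration (an assumption);
# verify them against the real file for your Unicode version.
SAMPLE_RECORDS = """\
0033;DIGIT THREE;Nd;0;EN;;3;3;3;N;;;;;
0BE9;TAMIL DIGIT THREE;Nd;0;L;;3;3;3;N;;;;;
00B2;SUPERSCRIPT TWO;No;0;EN;<super> 0032;;2;2;N;SUPERSCRIPT DIGIT TWO;;;;
00BD;VULGAR FRACTION ONE HALF;No;0;ON;<fraction> 0031 2044 0032;;;1/2;N;FRACTION ONE HALF;;;;
"""


def numeric_fields(record):
    """Return (code point, name, field 6, field 7, field 8) for one record.

    An empty field is reported as None. Field 8 may hold a fraction such
    as "1/2", so it is parsed as a Fraction rather than an int.
    """
    fields = record.split(";")
    code_point = int(fields[0], 16)
    decimal = int(fields[6]) if fields[6] else None       # value of decimal digit
    digit = int(fields[7]) if fields[7] else None         # value of digit
    numeric = Fraction(fields[8]) if fields[8] else None  # numeric value
    return code_point, fields[1], decimal, digit, numeric


def parse_records(text):
    """Parse every non-empty line of a UnicodeData.txt-style string."""
    return [numeric_fields(line) for line in text.splitlines() if line]


for cp, name, decimal, digit, numeric in parse_records(SAMPLE_RECORDS):
    print(f"{cp:04X} {name}: decimal={decimal} digit={digit} numeric={numeric}")
```

In these samples the two digit threes fill all three fields, SUPERSCRIPT TWO leaves field 6 empty, and VULGAR FRACTION ONE HALF fills only field 8, which already hints that the fields run from strictest to most permissive.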

Value of decimal digit is the strictest of the fields. The only characters that are "decimal digits" are those that act as—decimal digits. Thus '1' is a decimal digit, '' is a decimal digit (in Arabic), '' is a decimal digit (in Amharic), etc. These ...
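The strictness of the first field can be probed from a programming language. The following is a rough sketch using Python's standard unicodedata module, whose decimal, digit, and numeric functions expose the three values; the helpers numeric_properties and is_decimal_digit are hypothetical names introduced here. Results reflect the Unicode version bundled with the interpreter (reported by unicodedata.unidata_version), which may differ from the version the text describes.

```python
import unicodedata


def numeric_properties(ch):
    """Return (decimal, digit, numeric) for one character.

    Each lookup is given None as its default, so a character that lacks
    a value for a field yields None instead of raising ValueError.
    """
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


def is_decimal_digit(ch):
    """True only when the strictest field, the decimal digit value, is set."""
    return unicodedata.decimal(ch, None) is not None


# DIGIT ONE, TAMIL DIGIT THREE, SUPERSCRIPT TWO, VULGAR FRACTION ONE HALF,
# and a letter that has no numeric properties at all.
for ch in ("1", "\u0be9", "\u00b2", "\u00bd", "A"):
    print(f"U+{ord(ch):04X}", numeric_properties(ch), is_decimal_digit(ch))
```

Note that unicodedata.numeric returns a float, so a value stored as 1/2 comes back as 0.5 rather than as an exact fraction.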
