PropList.txt
The UnicodeData.txt file is supplemented by a variety of other files that supply information about Unicode characters. Many properties (e.g., East Asian width, Jamo short name, and so on) have their own files, but others are given in PropList.txt. In particular, PropList.txt lists “binary” properties: categories a character either is or isn't in. (For example, a character either is a digit or it isn't.)
PropList.txt uses the same format as most of the other files. Each line represents either a single character (in which case it starts with that character's code point value) or a range of characters (in which case it starts with the starting and ending code point values of the range, separated by “..”). A semicolon and the ...
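The line layout described above can be sketched as a small Python parser. This is a minimal illustration, not a reference implementation. It assumes that the field after the semicolon is the property name and that “#” begins a trailing comment, as in the published Unicode Character Database files; the excerpt above is cut off before it says so. The names parse_proplist_lines and SAMPLE are hypothetical, and the sample lines are only illustrative of the file's style.

```python
from typing import Iterable, Iterator, Tuple


def parse_proplist_lines(lines: Iterable[str]) -> Iterator[Tuple[int, int, str]]:
    """Yield (first_code_point, last_code_point, property_name) tuples.

    Assumption: each data line looks like "XXXX ; Name" or
    "XXXX..YYYY ; Name", optionally followed by a "#" comment.
    """
    for raw in lines:
        # Discard any trailing comment, then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 2:
            continue  # not a data line in the assumed layout
        code_points, property_name = fields[0], fields[1]
        if ".." in code_points:
            # A range: starting and ending code points separated by "..".
            start_hex, end_hex = code_points.split("..", 1)
        else:
            # A single character: the range starts and ends at the same value.
            start_hex = end_hex = code_points
        yield int(start_hex, 16), int(end_hex, 16), property_name


# Illustrative input only; a real run would read the lines of PropList.txt.
SAMPLE = """\
# Sample lines in the style of PropList.txt
0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
0020          ; White_Space # Zs       SPACE
"""

entries = list(parse_proplist_lines(SAMPLE.splitlines()))
```

Representing a single character as a range whose start and end are equal lets later code treat every entry the same way, for example when testing whether a given code point has a binary property.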