Other Categories
Over time, it has become necessary to draw finer distinctions between characters than the general categories allow. Some overlap has also been noticed between the general categories. Another set of categories, defined mostly in PropList.txt, was created to capture these distinctions.
Whitespace. Many processes that operate on text treat various characters as “whitespace,” important only insofar as it separates groups of other characters from one another. In Unicode, “whitespace” can be thought of mainly as consisting of the characters in the Z (separator) categories. This is one case in which the ISO control characters have real meaning—most processes want the code points corresponding to the old ASCII and ...
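As a rough illustration of the point about the Z categories, the following Python sketch uses the standard `unicodedata` module to look up the general category of a few characters. The helper `is_separator` and the `samples` table are hypothetical names introduced only for this example. Note that it checks general categories only. It does not consult the properties listed in PropList.txt.

```python
import unicodedata


def is_separator(ch: str) -> bool:
    """Return True if ch falls in one of the Z (separator) general categories."""
    return unicodedata.category(ch).startswith("Z")


# A few sample characters chosen for illustration (hypothetical selection).
samples = {
    "SPACE": "\u0020",
    "NO-BREAK SPACE": "\u00A0",
    "LINE SEPARATOR": "\u2028",
    "PARAGRAPH SEPARATOR": "\u2029",
    "CHARACTER TABULATION": "\u0009",
    "LATIN SMALL LETTER A": "a",
}

# Map each name to (general category, is it a Z-category separator?).
report = {
    name: (unicodedata.category(ch), is_separator(ch))
    for name, ch in samples.items()
}

for name, (category, separator) in report.items():
    print(f"{name:22} {category}  separator={separator}")
```

The tab character is reported with general category Cc rather than a Z category, so a check based on the Z categories alone would miss a control character that most text processes treat as whitespace.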