The UNIDATA Directory
The UNIDATA folder on the Unicode FTP/Web site (http://www.unicode.org/Public/UNIDATA/) is the official repository of the Unicode Character Database. Here's a quick rundown of what's in this directory:
ArabicShaping.txt groups Arabic and Syriac letters into categories depending on how they connect to their neighbors. The data in this file can be used to put together a minimally correct implementation of Arabic and Syriac character shaping.
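As a rough illustration of how this data might be consumed, the sketch below parses lines in the semicolon-separated layout the file uses. It assumes the field order is code point, schematic name, joining type, joining group; check this against the header comments of the file version you download. The names `ShapingEntry`, `parse_arabic_shaping`, and `SAMPLE` are hypothetical, and the two sample lines are only an excerpt for demonstration.

```python
from typing import Dict, NamedTuple


class ShapingEntry(NamedTuple):
    """One record from ArabicShaping.txt (hypothetical helper type)."""
    name: str
    joining_type: str   # e.g. "R" (right-joining) or "D" (dual-joining)
    joining_group: str


# Illustrative excerpt only; the real file is much longer.
SAMPLE = """\
# Assumed layout: code point; schematic name; joining type; joining group
0627; ALEF; R; ALEF
0628; BEH; D; BEH
"""


def parse_arabic_shaping(text: str) -> Dict[int, ShapingEntry]:
    """Map each code point to its shaping entry, skipping comments and blanks."""
    entries: Dict[int, ShapingEntry] = {}
    for raw in text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 4:
            continue  # Not a data line in the assumed layout.
        code, name, joining_type, joining_group = fields[:4]
        entries[int(code, 16)] = ShapingEntry(name, joining_type, joining_group)
    return entries


table = parse_arabic_shaping(SAMPLE)
```

A shaping routine could then look up `table[0x0628].joining_type` for each letter and its neighbors to decide which contextual form to use.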
BidiMirroring.txt is useful for implementing a rudimentary version of mirroring. For the characters whose glyphs in right-to-left text are supposed to be the mirror image of their glyphs in left-to-right text (the characters with the “mirrored” property), this file identifies those that ...