Compatibility Decompositions
Canonical composites are just one kind of compatibility character; in fact, they're only one kind of composite character. Unicode is also rife with compatibility composites, which account for 3,165 assigned code point values in Unicode 3.1. Each of these characters has an assigned code point value in some encoding standard in reasonably widespread use. They are characters from those standards that wouldn't have made it into Unicode on their own merits, but were given their own code point values so that text can be converted from the source encodings to Unicode and back again without losing any of the original information. This ability is usually referred to as “round-trip compatibility.”
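To make the distinction concrete, here is a minimal sketch using Python's standard unicodedata module. The helper name decomposition_kind is hypothetical, chosen only for this illustration. Python ships a much newer version of the Unicode character database than Unicode 3.1, so counts taken from it will not match the figure above. The sketch relies on the fact that compatibility mappings in the database carry a tag in angle brackets, such as <compat> or <super>, while canonical mappings carry none.

```python
import unicodedata


def decomposition_kind(ch: str) -> str:
    """Classify one character by its decomposition mapping (hypothetical helper)."""
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    # Compatibility mappings start with a tag such as <compat> or <super>;
    # canonical mappings are a bare list of code point values.
    return "compatibility" if mapping.startswith("<") else "canonical"


# U+00E9 (e with acute), U+FB01 (fi ligature), U+00B2 (superscript two)
for ch in ["\u00e9", "\ufb01", "\u00b2"]:
    print(f"U+{ord(ch):04X}", decomposition_kind(ch), unicodedata.decomposition(ch))

ligature = "\ufb01"
# Canonical decomposition (NFD) leaves the ligature untouched ...
print(unicodedata.normalize("NFD", ligature) == ligature)
# ... while compatibility decomposition (NFKD) replaces it with plain "fi".
print(unicodedata.normalize("NFKD", ligature) == "fi")
```

Once the ligature has been replaced by the two letters, nothing in the text records that it was ever a single code point in the source encoding. That lost distinction is the information a separate code point value preserves for round-trip conversion.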
A few important ...