Mapping Multiple Characters to Other Values

In some situations, you want to map a key that may consist of more than one character to some other value. For example, where Unicode decomposition maps single characters to variable-length strings, Unicode composition does the reverse and maps variable-length strings to single characters. Even when you're consistently mapping single characters to single characters (or to some other fixed-length value), a UTF-16–based system (or a system that wants to save table space) might choose to treat supplementary-plane characters as pairs of surrogate code units, effectively treating single characters as “strings.”
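To make the two cases above concrete, here is a minimal Python sketch. It is not the Unicode composition algorithm (which also has to deal with combining classes and composition exclusions); it only illustrates a table whose keys are variable-length strings, looked up by trying the longest candidate key first. The table contents and the names `COMPOSITION_TABLE`, `map_longest_match`, and `to_surrogate_pair` are hypothetical and chosen for illustration.

```python
# Hypothetical toy table: variable-length keys mapped to single characters.
# A real composition table is far larger and is not applied this naively.
COMPOSITION_TABLE = {
    "e\u0301": "\u00e9",        # e + combining acute -> U+00E9
    "u\u0308": "\u00fc",        # u + combining diaeresis -> U+00FC
    "u\u0308\u0304": "\u01d6",  # u + diaeresis + macron -> U+01D6
}
MAX_KEY_LENGTH = max(len(key) for key in COMPOSITION_TABLE)


def map_longest_match(text, table=COMPOSITION_TABLE, max_len=MAX_KEY_LENGTH):
    """Replace each longest matching key in text with its mapped value."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible key starting at position i first.
        for length in range(min(max_len, len(text) - i), 0, -1):
            key = text[i:i + length]
            if key in table:
                out.append(table[key])
                i += length
                break
        else:
            # No key starts here: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def to_surrogate_pair(code_point):
    """Split a supplementary-plane code point into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return (high, low)
```

With this sketch, `map_longest_match("u\u0308\u0304")` yields the single character U+01D6 rather than U+00FC followed by a stray macron, because the three-character key is tried before the two-character one. Likewise, `to_surrogate_pair(0x1D11E)` returns the two code units `0xD834` and `0xDD1E`, which a UTF-16–based table could treat as a two-unit key in the same way.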

Like mapping single characters (or other values) to variable-length strings, mapping variable-length ...
