Mapping Multiple Characters to Other Values

In some situations, you want to map a key that may consist of more than one character to some other value. For example, where Unicode decomposition maps single characters to variable-length strings, Unicode composition does the reverse and maps variable-length strings to single characters. Even when you're consistently mapping single characters to single characters (or to some other fixed-length value), a UTF-16–based system (or a system that wants to save table space) might choose to treat supplementary-plane characters as pairs of surrogate code units, effectively treating single characters as “strings.”
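To make the idea concrete, here is a minimal Python sketch of a multi-character-key lookup. It is not the Unicode composition algorithm (it ignores combining classes and the real composition data), and the table entries and the function names `map_longest_match` and `utf16_key` are hypothetical, chosen only for illustration. It assumes a greedy longest-match strategy, which is one common way to resolve keys that share a prefix.

```python
# Hypothetical, hand-picked table: multi-character keys -> single characters.
# A real implementation would derive its entries from the Unicode Character Database.
COMPOSITIONS = {
    "A\u0300": "\u00C0",        # A + combining grave accent
    "e\u0301": "\u00E9",        # e + combining acute accent
    "o\u0302": "\u00F4",        # o + combining circumflex
    "o\u0302\u0301": "\u1ED1",  # o + combining circumflex + combining acute
}


def map_longest_match(text, table=COMPOSITIONS):
    """Replace the longest key found at each position; copy unmatched characters."""
    max_len = max(map(len, table))
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate key first, then progressively shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            value = table.get(text[i:i + n])
            if value is not None:
                out.append(value)
                i += n
                break
        else:
            # No key starts here, so pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def utf16_key(ch):
    """Return the UTF-16 code units of one character as a tuple (a short 'string' key)."""
    data = ch.encode("utf-16-be")
    return tuple(int.from_bytes(data[j:j + 2], "big") for j in range(0, len(data), 2))
```

With this table, `map_longest_match("o\u0302\u0301")` picks the three-character key rather than stopping at the two-character one, and `utf16_key("\U0001D11E")` yields the surrogate pair `(0xD834, 0xDD1E)`, showing how a single supplementary-plane character becomes a two-unit key in a UTF-16 table.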

Like mapping single characters (or other values) to variable-length strings, mapping variable-length ...
