
Canonical and compatibility equivalence
Although canonical and compatibility mappings are one-directional and do not mean
equivalence, we can define equivalence relations based on them. Canonical and
compatibility equivalence are defined for sequences of characters (i.e., strings), with a
single character regarded as a special case. The exact definitions will be given later
in this chapter, but the basic idea is the following. Two strings are canonical equivalent
if their canonical decompositions, obtained by applying all canonical mappings, are the
same. Thus, in particular, if A has a canonical mapping to B, then A and B are canonical
equivalent. Compatibility equivalence is defined in a similar way, except that both
compatibility and canonical mappings are applied before the results are compared.
The term “canonical equivalent” is from the Unicode standard, so we
use it in this book, instead of the grammatically more correct expression
“canonically equivalent.”
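The basic idea can be tried out in a few lines of Python. This is only a sketch, and it assumes that full canonical decomposition corresponds to normalization form NFD and that applying both compatibility and canonical mappings corresponds to NFKD, as provided by the standard `unicodedata` module. The function names `canonical_equivalent` and `compatibility_equivalent` are made up for this illustration.

```python
import unicodedata


def canonical_equivalent(a: str, b: str) -> bool:
    """Compare the canonical decompositions (assumed here to be NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def compatibility_equivalent(a: str, b: str) -> bool:
    """Compare after applying compatibility and canonical mappings (NFKD)."""
    return unicodedata.normalize("NFKD", a) == unicodedata.normalize("NFKD", b)


# U+00E9 (e with acute) versus "e" followed by U+0301 (combining acute accent):
# the first has a canonical mapping to the second.
print(canonical_equivalent("\u00E9", "e\u0301"))       # True

# U+FB01 (the "fi" ligature) has only a compatibility mapping to "fi".
print(canonical_equivalent("\uFB01", "fi"))            # False
print(compatibility_equivalent("\uFB01", "fi"))        # True
```

Since the compatibility comparison applies canonical mappings as well, any pair of strings that passes the first test also passes the second.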
The meaning of canonical mapping
We already mentioned that canonical mapping does not mean identity, despite the
symbol commonly used to denote it. A relationship like Ω U+2126 ≡ Ω U+03A9 is a
relation between two distinct characters. We should expect that programs often make
no distinction between them, but nothing prevents a program from treating them differently.
For example, a program might recognize U+2126 but not U+03A9, or vice versa. It
would then behave differently for them, of course. ...
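The following sketch shows how such a difference in behavior can arise. The lookup table `UNIT_NAMES` and the two functions are hypothetical, invented for this illustration. The table knows only the Greek letter U+03A9, so a direct lookup fails for U+2126, whereas a lookup that first normalizes its input (here to NFC, which replaces U+2126 with U+03A9) succeeds.

```python
import unicodedata

OHM_SIGN = "\u2126"     # OHM SIGN
GREEK_OMEGA = "\u03A9"  # GREEK CAPITAL LETTER OMEGA

# The two are distinct characters with distinct names...
print(OHM_SIGN == GREEK_OMEGA)          # False
print(unicodedata.name(OHM_SIGN))       # OHM SIGN
print(unicodedata.name(GREEK_OMEGA))    # GREEK CAPITAL LETTER OMEGA

# ...but U+2126 has a canonical mapping to U+03A9.
print(unicodedata.decomposition(OHM_SIGN))  # 03A9

# Hypothetical table that recognizes only the Greek letter.
UNIT_NAMES = {GREEK_OMEGA: "ohm"}


def unit_name_naive(symbol: str):
    """Look the symbol up as is; U+2126 is not recognized."""
    return UNIT_NAMES.get(symbol)


def unit_name_normalized(symbol: str):
    """Normalize first, so canonical equivalent input is treated alike."""
    return UNIT_NAMES.get(unicodedata.normalize("NFC", symbol))


print(unit_name_naive(OHM_SIGN))        # None
print(unit_name_normalized(OHM_SIGN))   # ohm
```

Whether a program should normalize its input in this way is a design decision; the sketch only shows that the outcome depends on it.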