
Blocked
In a string that begins with a starter, a character C is said to be blocked from the
starter S if there is a character B between them such that either B is a starter or B
has a combining class value at least as high as C’s.
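The definition can be checked mechanically. The following is a minimal sketch in
Python, not a reference implementation; the function name is_blocked and its index
parameters are hypothetical, and the only library call used is the standard
unicodedata.combining, which returns the canonical combining class (0 for a starter).

```python
import unicodedata

def is_blocked(s: str, starter_index: int, c_index: int) -> bool:
    """Return True if s[c_index] is blocked from the starter at s[starter_index].

    Hypothetical helper: it assumes s[starter_index] is a starter and
    starter_index < c_index.
    """
    ccc_c = unicodedata.combining(s[c_index])
    # Look at every character B strictly between the starter S and C.
    for b in s[starter_index + 1 : c_index]:
        ccc_b = unicodedata.combining(b)
        # B blocks C if B is a starter (class 0) or its class is >= C's class.
        if ccc_b == 0 or ccc_b >= ccc_c:
            return True
    return False
```

For instance, in "a" followed by U+0323 (dot below, class 220) and U+0302
(circumflex, class 230), the circumflex is not blocked from "a", because the
intervening mark has a lower class. In "a" followed by U+0302 and U+0301 (both
class 230), the second mark is blocked.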
Primary composite
A character is said to be a primary composite if it has a canonical mapping and it
has not been explicitly excluded from composition by assigning the value yes (True)
to the property CE (Composition Exclusion) for the character. See the subsection
“Composition Exclusions” later in this chapter.
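As a rough illustration, the test can be approximated in Python. This is a sketch
under stated assumptions: the standard unicodedata module does not expose the
Composition Exclusion property, so the hypothetical function is_primary_composite
uses a proxy instead, namely whether the character survives NFC normalization
unchanged. The proxy is somewhat stricter than the CE property alone, since it also
rejects singleton mappings such as U+212B.

```python
import unicodedata

def is_primary_composite(ch: str) -> bool:
    """Approximate test for a primary composite (hypothetical helper)."""
    mapping = unicodedata.decomposition(ch)
    # Compatibility mappings carry a tag such as "<compat>"; canonical ones do not.
    has_canonical_mapping = bool(mapping) and not mapping.startswith("<")
    # Assumption: a character excluded from composition does not come back
    # when it is normalized to NFC, so "unchanged by NFC" stands in for CE = no.
    survives_nfc = unicodedata.normalize("NFC", ch) == ch
    return has_canonical_mapping and survives_nfc
```

With this sketch, U+00EA (ê) qualifies, whereas plain "e" (no mapping), U+00B5
(compatibility mapping only), and U+0958 (excluded from composition) do not.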
We can now define the construction of the NFC of a string as consisting of the
following steps:
1. Construct the canonical decomposition of the string. (Note that this includes re-
ordering of consecutive nonspacing marks.)
2. Process the result by successively composing each character with the nearest
preceding starter, provided that the character is not blocked from that starter.
Composing character C with a starter S means that if there is a primary composite Z
that is canonically equivalent to the string consisting of S followed by C, then S is
replaced by Z, and C is removed.
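The two steps can be sketched in Python as follows. This is an illustrative sketch,
not a replacement for unicodedata.normalize: the names build_pair_table, PAIRS, and
nfc_sketch are hypothetical, step 1 is delegated to the library, and the table of
primary composites is derived with the assumption that a composite left unchanged
by NFC is not excluded from composition. Hangul syllables, which are composed
algorithmically, are outside the scope of the sketch.

```python
import unicodedata

def build_pair_table() -> dict:
    """Map (S, C) pairs to the primary composite Z equivalent to S followed by C."""
    table = {}
    for cp in range(0x110000):
        ch = chr(cp)
        mapping = unicodedata.decomposition(ch)
        if not mapping or mapping.startswith("<"):
            continue  # no mapping, or a compatibility mapping only
        parts = mapping.split()
        if len(parts) != 2:
            continue  # singleton mappings never take part in composition
        if unicodedata.normalize("NFC", ch) != ch:
            continue  # assumption: treated as excluded from composition
        table[(chr(int(parts[0], 16)), chr(int(parts[1], 16)))] = ch
    return table

PAIRS = build_pair_table()

def nfc_sketch(text: str) -> str:
    # Step 1: canonical decomposition, including reordering of marks.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    starter_pos = None  # position in out of the nearest preceding starter
    for c in decomposed:
        ccc = unicodedata.combining(c)
        if starter_pos is not None:
            # Step 2: C is blocked if a character kept between S and C is a
            # starter or has a combining class at least as high as C's.
            blocked = any(
                unicodedata.combining(b) == 0 or unicodedata.combining(b) >= ccc
                for b in out[starter_pos + 1 :]
            )
            z = PAIRS.get((out[starter_pos], c))
            if not blocked and z is not None:
                out[starter_pos] = z  # S is replaced by Z, and C is removed
                continue
        if ccc == 0:
            starter_pos = len(out)
        out.append(c)
    return "".join(out)
```

Building the table scans the whole code space once, which is acceptable for
experimenting but not how a production normalizer would store its data.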
This is a bit complicated, so let us consider a simple example. Assume that the initial
string is U+00EA U+0323—i.e., ê followed by combining dot below. The process of
converting it to NFC is presented stepwise in Table 5-5. For clarity, the combining
diacritic marks are visualized as ^ (denoting circumflex above) ...