September 2002
Intermediate to advanced
896 pages
21h 3m
English
Unicode lets you attach an arbitrary number of combining marks to a base character, so combining character sequences can be arbitrarily long. Because many languages permit a single base character to carry two or more diacritical marks, such sequences do occur in practice. In some cases, even a precomposed character includes multiple diacriticals.
One of the crazy things about having both precomposed characters and combining character sequences is that the two can be mixed to represent the same character. Consider the letter o with a circumflex above and a dot below, which occurs in Vietnamese. This letter has five possible representations in Unicode:
U+006F LATIN SMALL LETTER O U+0302 ...
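To make the equivalence concrete, here is a minimal Python sketch using the standard `unicodedata` module. The five sequences in it are one way to enumerate the representations, built from the precomposed characters U+00F4, U+1ECD, and U+1ED9 and the combining marks U+0302 and U+0323; the list name, its order, and the comments are illustrative assumptions rather than the book's own listing. It shows that five distinct code point sequences collapse to a single form under canonical normalization.

```python
import unicodedata

# Five distinct code point sequences for "o with circumflex and dot below".
# (Illustrative enumeration; the order here is an assumption.)
representations = [
    "\u006F\u0302\u0323",  # o + COMBINING CIRCUMFLEX ACCENT + COMBINING DOT BELOW
    "\u006F\u0323\u0302",  # o + COMBINING DOT BELOW + COMBINING CIRCUMFLEX ACCENT
    "\u00F4\u0323",        # precomposed o-circumflex + COMBINING DOT BELOW
    "\u1ECD\u0302",        # precomposed o-dot-below + COMBINING CIRCUMFLEX ACCENT
    "\u1ED9",              # fully precomposed character
]

# As raw strings, the five sequences are all different.
distinct_raw = len(set(representations))

# After canonical normalization, they collapse to one form each.
nfc_forms = {unicodedata.normalize("NFC", s) for s in representations}
nfd_forms = {unicodedata.normalize("NFD", s) for s in representations}

for s in representations:
    code_points = " ".join(f"U+{ord(c):04X}" for c in s)
    print(f"{code_points:<24} -> NFC: "
          + " ".join(f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", s)))
```

Comparing the strings directly treats them as five different values, so code that compares or indexes such text would typically normalize both sides first (to NFC or NFD) before testing for equality.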