How Unicode Non-spacing Marks Work

Three rules govern the behavior of Unicode non-spacing marks:

  1. A non-spacing mark always combines with the character that precedes it. If the backing store contains the character codes

    U+006F LATIN SMALL LETTER O
    U+0308 COMBINING DIAERESIS
    U+006F LATIN SMALL LETTER O
    

    they represent the sequence öo

    and not the sequence oö

    In other words, the diaeresis attaches to the o that precedes it.

    Unicode's designers could have gone either way with this decision. It really doesn't make much difference whether the mark attaches to the ...
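
Rule 1 can be checked with a minimal sketch in Python, assuming the standard `unicodedata` module and NFC normalization as a convenient way to make the attachment visible. The variable names are illustrative and do not come from the book.

```python
import unicodedata

# The sequence from rule 1: o, COMBINING DIAERESIS, o
s = "\u006F\u0308\u006F"

# A non-zero canonical combining class identifies U+0308 as a combining mark.
is_combining = unicodedata.combining("\u0308") != 0

# NFC composes a base character with the combining marks that FOLLOW it,
# so the diaeresis can only merge with the first o.
composed = unicodedata.normalize("NFC", s)

for ch in composed:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
# U+00F6 LATIN SMALL LETTER O WITH DIAERESIS
# U+006F LATIN SMALL LETTER O
```

The composed result is ö followed by a plain o, which matches the rule: the mark binds to the character before it, never to the one after.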
