
Unified diacritics
Similar-looking diacritic marks used in different languages and with different
meanings have generally been unified, even across scripts. Thus, the acute accent
used in French (e.g., on the “e” letters in “bébé”) is coded as the same character as
the acute accent used in Polish (e.g., on the “n” letter in “Gdańsk”), even though
traditional typography for the two languages uses rather different shapes for the
acute. The acute accent is even unified with the Greek tonos mark (e.g., on the first
letter in “ώρα”), even though that mark is commonly called tonos rather than acute
and its traditional shape differs from both the French and the Polish style. You
often do not see such differences in shape, because a font typically uses one uniform
design for each diacritic. However, a diacritic on a Latin letter often looks different
from the same diacritic on a non-Latin letter.
The unification applies to the diacritic as a combining mark and as a spacing
character (such as the acute accent, U+00B4, ´), as well as to any precomposed
letters containing the diacritic (e.g., é as used in French is coded as the same
character as é used in Hungarian).
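The unification can be observed by decomposing the example words. The following is a minimal sketch in Python, using the standard unicodedata module; the helper name combining_marks is hypothetical, chosen only for this illustration. It applies canonical decomposition (NFD) to each word and collects the combining marks that result.

```python
import unicodedata


def combining_marks(text):
    """Return the set of combining marks found in the NFD form of text.

    Hypothetical helper for illustration only.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a nonzero class for combining marks.
    return {ch for ch in decomposed if unicodedata.combining(ch)}


# French, Polish, and Greek examples from the text.
for word in ("bébé", "Gdańsk", "ώρα"):
    marks = combining_marks(word)
    labels = [f"U+{ord(m):04X} {unicodedata.name(m)}" for m in marks]
    print(word, labels)
```

With a standard Python 3 installation, each of the three words should report only U+0301 COMBINING ACUTE ACCENT, which shows that the French acute, the Polish acute, and the Greek tonos share one coded combining character, whatever shape a font gives them.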
Unification prevented by mapping considerations
Some capital letters have not been unified with each other, despite similar or
identical appearance, when the corresponding lowercase letters differ. For example, Latin
capital letter eth Ð and Latin ...