Beyond the considerations above, which apply regardless of which encoding standard you use for your characters, Unicode adds a few more interesting complications.
Unlike most other encoding schemes, Unicode gives many characters and sequences of characters more than one legal representation. One of the requirements of supporting Unicode is that, provided you support all of the characters involved, all representations of a character be treated as equal. Thus, whether you represent “ä” with
U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
or
U+0061 LATIN SMALL LETTER A followed by U+0308 COMBINING DIAERESIS
it should look and behave the same way everywhere. ...
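As a minimal sketch of what treating the two representations as equal can mean in practice, the Python example below compares the two spellings of “ä” listed above. It assumes that normalizing both strings to the same Unicode normalization form (NFC here) is an acceptable way to make the comparison; the text above does not prescribe a method. The helper name `canonically_equal` is hypothetical and introduced only for illustration.

```python
import unicodedata

# The two representations of "ä" discussed above.
precomposed = "\u00E4"   # U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
decomposed = "a\u0308"   # U+0061 LATIN SMALL LETTER A + U+0308 COMBINING DIAERESIS


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC.

    Assumption: normalization is used as the equivalence test; this is one
    possible approach, not the only one.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A plain code-point comparison sees two different strings.
print(precomposed == decomposed)                    # False
# After normalization, the two representations compare as equal.
print(canonically_equal(precomposed, decomposed))   # True
# The strings differ in length when counted in code points.
print(len(precomposed), len(decomposed))            # 1 2
```

The point of the sketch is that a naive code-point comparison does not satisfy the requirement, so code that compares, searches, or sorts Unicode text generally needs some step that accounts for equivalent representations.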