Unicode Demystified by Richard Gillam

ISO 10646 and Unicode

It should be fairly clear by now that we have a real mess on our hands. Literally hundreds of different encoding standards exist, many of which are redundant, encoding the same characters but encoding them differently. Even the ISO 8859 family includes 10 different encodings of the Latin alphabet, each containing a slightly different set of letters.
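To make the redundancy concrete, here is a minimal sketch using Python's built-in codecs. It is not taken from the book. The sample character (é) and the sample byte (0xA4) are arbitrary choices made for illustration, as are the codec names, which are simply the ones Python ships with. The first half shows one character encoded four different ways; the second shows one byte decoding to two different characters within the ISO 8859 family itself.

```python
import unicodedata

# Illustrative choice: LATIN SMALL LETTER E WITH ACUTE.
char = "\u00e9"

# The same character becomes different bytes under different encodings.
encoded = {
    name: char.encode(name)
    for name in ("iso8859_1", "cp437", "mac_roman", "utf_8")
}
for name, data in encoded.items():
    print(f"{name:10} {data.hex()}")

# Illustrative choice: a single byte that two ISO 8859 parts disagree on.
byte = b"\xa4"

# The same byte becomes different characters, even inside one family.
decoded = {name: byte.decode(name) for name in ("iso8859_1", "iso8859_15")}
for name, text in decoded.items():
    # Print the character's Unicode name rather than the character itself,
    # so the output does not depend on the terminal's own encoding.
    print(f"{name:10} {unicodedata.name(text)}")
```

Nothing in the bytes themselves records which of these encodings produced them, so a reader of the data has to be told.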

As a consequence, you have to be very explicit about which encoding scheme you're using, lest the computer interpret your text as characters other than the ones you intend, garbling it in the process (log onto a Japanese Web site with an American computer, for example, and it's likely you'll see garbage rather than Japanese). About the only thing that's safely interchangeable ...
