
The XML Companion, Third Edition, by Neil Bradley

Unicode and ISO/IEC 10646

As there are far more than 256 symbols in use in the world, even ISO 8859 cannot represent them all. One obvious solution is to use more than one byte to encode each character, and two standards have emerged that use this technique. These are the Unicode and ISO/IEC 10646 standards (see www.unicode.org).
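The arithmetic behind the multi-byte approach can be sketched in a few lines of Python. This is only an illustration: the helper name `two_byte_encoding` is hypothetical, and storing the character number as two big-endian bytes is an assumption made for the example, not a layout prescribed by the text.

```python
# Number of distinct values that fit in one byte and in two bytes.
ONE_BYTE_VALUES = 2 ** 8    # 256
TWO_BYTE_VALUES = 2 ** 16   # 65,536


def two_byte_encoding(ch: str) -> bytes:
    """Store one character's number in exactly two bytes.

    Hypothetical helper for illustration; big-endian byte order is an
    assumption, and characters numbered 65,536 or above are rejected.
    """
    code_point = ord(ch)
    if code_point >= TWO_BYTE_VALUES:
        raise ValueError("character does not fit in two bytes")
    return code_point.to_bytes(2, "big")


if __name__ == "__main__":
    print(ONE_BYTE_VALUES, TWO_BYTE_VALUES)
    # "A" fits in one byte; the euro sign needs both bytes.
    print(two_byte_encoding("A").hex())
    print(two_byte_encoding("\u20ac").hex())
```

A character such as "A" leaves the first byte at zero, while a symbol outside the single-byte range, such as the euro sign, uses both bytes.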

Unicode

The Unicode standard, now at version 3.0 (September 1999), was the first of these initiatives. It uses two bytes for each character, immediately raising the capacity to 65,536 characters (though it actually contains just under 50,000 at the time of writing). Online charts of the characters covered can be found at www.unicode.org/charts. The number of characters of each type is listed below:

  • Alphabetics and ...
