Appendix B. Character Sets and Entities

An XML document may contain almost any character from the Unicode character set; XML excludes only a few code points, such as most control characters. Unicode, which is kept in step with the ISO/IEC 10646 standard, is a universal character set (UCS) and covers nearly all the characters of the world's written languages in use today.

In most cases, you can simply type a character into an XML document, and it will display as expected. However, accented letters, characters from other writing systems, and special symbols may not display correctly. That’s because not all computer applications store character information in the same way.
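As a rough illustration of how the same character can be stored differently, the following Python sketch encodes one word in two common encodings and then reads the bytes back with the wrong one. The sample word and the choice of UTF-8 and ISO-8859-1 are assumptions made for this example; they are not taken from the text above.

```python
# Illustrative sketch: the sample text and both encodings are assumptions.
text = "caf\u00e9"  # "café", written with an escape so the source stays ASCII

# The same four characters produce different bytes in each encoding.
utf8_bytes = text.encode("utf-8")         # é is stored as two bytes
latin1_bytes = text.encode("iso-8859-1")  # é is stored as one byte

print(utf8_bytes)    # b'caf\xc3\xa9'
print(latin1_bytes)  # b'caf\xe9'

# An application that assumes ISO-8859-1 while reading UTF-8 bytes
# turns the single accented letter into two unrelated characters.
misread = utf8_bytes.decode("iso-8859-1")
print(ascii(misread))  # 'caf\xc3\xa9', that is, "caf" followed by Ã and ©
```

The plain letters c, a, and f come out the same either way, which is why unaccented English text usually survives such a mismatch while accented letters and symbols do not.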

The method of storing the characters from a character set is called character encoding. And, if your XML document and the application displaying it do ...
