Appendix B. Character Sets and Entities

An XML document may contain almost any character from the Unicode character set; only a few control characters are excluded. Unicode, which is kept in step with the ISO/IEC 10646 standard, is a universal character set (UCS) that represents nearly all the characters used in the world's written languages today.

In most cases, you can simply type a character into an XML document, and it will display as expected. However, accented letters, characters from non-Latin scripts, and special symbols may not display correctly. That’s because not all computer applications store character information in the same way.
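To make the storage difference concrete, here is a minimal Python sketch. It is an illustration only: the sample word `café` and the choice of UTF-8 and ISO-8859-1 are assumptions made for this example, not something the text prescribes. It shows the same accented character stored as different bytes, and what happens when bytes written one way are read another way.

```python
# Hypothetical sample text; any word with an accented character would do.
text = "café"

# Store the same four characters using two different byte layouts.
utf8_bytes = text.encode("utf-8")          # "é" becomes two bytes
latin1_bytes = text.encode("iso-8859-1")   # "é" becomes one byte

# Read the UTF-8 bytes as if they were ISO-8859-1: the accent is garbled.
misread = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes)    # b'caf\xc3\xa9'
print(latin1_bytes)  # b'caf\xe9'
print(misread)       # cafÃ©
```

The unaccented letters come through unchanged because both layouts store them as the same single bytes; only the accented character is affected.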

The method of storing the characters from a character set is called character encoding. And, if your XML document and the application displaying it do ...
