XML in a Nutshell, 3rd Edition

by Elliotte Rusty Harold, W. Scott Means
September 2004
Intermediate to advanced
712 pages
24h 45m
English
O'Reilly Media, Inc.

Chapter 27. Character Sets

By default, an XML parser assumes that XML documents are written in the UTF-8 encoding of Unicode. However, a document may instead be written in any character set the XML processor understands, provided the character set is identified either by external metadata, such as an HTTP header, or by internal metadata, such as a byte-order mark or an encoding declaration. For example, a document written in the Latin-5 character set (ISO-8859-9) would need this XML declaration:

<?xml version="1.0" encoding="ISO-8859-9"?>
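As a rough illustration of how a parser uses that declaration, the following Python sketch builds a small Latin-5 document in memory and parses it with the standard library's `xml.etree.ElementTree`. The element name `city` and the sample text are made up for the example, and the sketch assumes the underlying parser can resolve ISO-8859-9 through Python's codec machinery.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; the letters İ and ı are part of Latin-5 (ISO-8859-9).
text = "İstanbul, Diyarbakır"

# The encoding declaration names the character set the bytes are written in.
document = f'<?xml version="1.0" encoding="ISO-8859-9"?>\n<city>{text}</city>'

# Encode the document with the same character set the declaration names.
latin5_bytes = document.encode("iso-8859-9")

# Parsing from bytes lets the parser read the declaration and decode accordingly.
root = ET.fromstring(latin5_bytes)
parsed_text = root.text
```

If the declaration were omitted, the parser would fall back to UTF-8 and would most likely reject these bytes as malformed.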

Most good XML processors understand many common character sets. The XML specification recommends the character set names shown in Table 27-1, and you should use these names when working with any of the character sets listed. Only UTF-8 and UTF-16 must be supported by all XML processors, although many processors support every character set listed here, and many support additional ones besides. When using a character set not listed here, use the name specified in the IANA character sets registry at http://www.iana.org/assignments/character-sets.
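Because UTF-8 and UTF-16 are the two encodings every processor must accept, the same content should parse identically in either one. The Python sketch below is one way to check this with the standard library; the helper `parse_as`, the element name `note`, and the sample text are illustrative assumptions, not anything defined by the XML specification.

```python
import xml.etree.ElementTree as ET

# Illustrative text containing characters outside ASCII.
TEXT = "Ölçü: 5 €"

def parse_as(encoding_name: str) -> str:
    """Encode a tiny document in the named encoding, parse it, return its text."""
    doc = f'<?xml version="1.0" encoding="{encoding_name}"?><note>{TEXT}</note>'
    # Python's "UTF-16" codec prepends a byte-order mark when encoding.
    data = doc.encode(encoding_name)
    return ET.fromstring(data).text

utf8_text = parse_as("UTF-8")
utf16_text = parse_as("UTF-16")
```

Both calls should return the same string, even though the byte sequences handed to the parser differ.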

Table 27-1. Character set names defined by the XML specification

| Name | Character set |
|------|---------------|
| UTF-8 | The default encoding used in XML documents, unless an encoding declaration, byte-order mark, or external metadata specifies otherwise; a variable-width encoding of Unicode that uses one to four bytes per character. UTF-8 is designed such that all ASCII documents are legal UTF-8 documents, which is not ... |
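The variable-width property of UTF-8 and its compatibility with ASCII can be seen in a few lines of Python. This is only a sketch; the sample characters and the `greeting` document are chosen for illustration.

```python
# One sample character for each UTF-8 width (illustrative choices):
# "A" is ASCII, "é" is Latin-1 range, "€" is in the Basic Multilingual Plane,
# and "😀" lies outside it.
samples = ["A", "é", "€", "😀"]

# Map each character to the number of bytes its UTF-8 encoding occupies.
widths = {ch: len(ch.encode("utf-8")) for ch in samples}

# A document that contains only ASCII characters produces the same bytes
# whether it is encoded as ASCII or as UTF-8.
ascii_doc = '<?xml version="1.0"?><greeting>hello</greeting>'
same_bytes = ascii_doc.encode("ascii") == ascii_doc.encode("utf-8")
```

Here `widths` maps the four samples to 1, 2, 3, and 4 bytes respectively, and `same_bytes` is `True`.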

ISBN: 0596007647