XML in a Nutshell, 3rd Edition

by Elliotte Rusty Harold, W. Scott Means

September 2004

Intermediate to advanced

712 pages

24h 45m

English

O'Reilly Media, Inc.

Content preview from XML in a Nutshell, 3rd Edition

Character References

Unicode contains more than 96,000 different characters covering almost all of the world's written languages. Predefining entity references for each of these characters, most of which will never be used in any one document, would impose an excessive burden on XML parsers. Rather than pick and choose which characters are worthy of being encoded as entities, XML goes to the other extreme. It predefines entity references only for characters that have special meaning as markup in an XML document: <, >, &, ", and '. All these are ASCII characters that are easy to type in any text editor.
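As a quick illustration of the five predefined entity references, the following sketch uses Python's standard xml.etree.ElementTree module (a choice made here for illustration; the text itself does not depend on any particular parser). The element name sample is hypothetical.

```python
import xml.etree.ElementTree as ET

# A tiny document that uses each of the five predefined entity references.
# The element name "sample" is made up for this example.
doc = "<sample>&lt; &gt; &amp; &quot; &apos;</sample>"

# The parser resolves each reference to the character it stands for.
text = ET.fromstring(doc).text
print(text)  # < > & " '
```

No declaration is needed for these five references; any other entity reference would have to be declared before a parser could resolve it.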

For other characters that may not be accessible from an ASCII text editor, XML lets you use character references. A character reference gives the number of the particular Unicode character it stands for, in either decimal or hexadecimal. Decimal character references look like &#1114;; hexadecimal character references have an extra x after the &#, so they look like &#x45A;. Both of these references refer to the same character, њ, the Cyrillic small letter "nje" used in Serbian and Macedonian. For example, suppose you want to include the Greek maxim "σοφός έαυτόν γιγνώσκει" ("The wise man knows himself") in your XML document. However, you only have an ASCII text editor at your disposal. You can replace each Greek letter with the correct character reference, like this:

<maxim> &#x3C3;&#x3BF;&#x3C6;&#x3CC;&#x3C2; &#x3AD;&#x3B1;&#x3C5;&#x3C4;&#x3CC;&#x3BD; &#x3B3;&#x3B9;&#x3B3;&#x3BD;&#x3CE;&#x3C3;&#x3BA;&#x3B5;&#x3B9; ...
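The same replacement can be done programmatically. The sketch below is one possible approach in Python, not something taken from the book: the helper name to_char_refs is hypothetical, and it assumes the input contains no markup characters such as < or &, which would need separate escaping.

```python
import xml.etree.ElementTree as ET


def to_char_refs(text: str) -> str:
    """Replace each non-ASCII character with a hexadecimal character reference.

    Hypothetical helper for illustration; it does not escape < or &.
    """
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)


# Decimal and hexadecimal references to U+045A resolve to the same character.
nje = ET.fromstring("<c>&#1114;&#x45A;</c>").text
assert nje == "\u045a\u045a"

# Python's built-in error handler emits decimal character references.
assert "\u045a".encode("ascii", "xmlcharrefreplace") == b"&#1114;"

# First word of the maxim, written with escapes so this source stays ASCII.
word = "\u03c3\u03bf\u03c6\u03cc\u03c2"
encoded = to_char_refs(word)
print(encoded)  # &#x3C3;&#x3BF;&#x3C6;&#x3CC;&#x3C2;

# The ASCII-only document parses back to the original Greek text.
xml_doc = f"<maxim>{encoded}</maxim>"
assert xml_doc.isascii()
assert ET.fromstring(xml_doc).text == word
```

Whether to emit decimal or hexadecimal references is a matter of taste; hexadecimal matches the way Unicode code points are usually written.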

ISBN: 0596007647
