XML in a Nutshell, 3rd Edition

by Elliotte Rusty Harold, W. Scott Means

September 2004

Intermediate to advanced

712 pages

24h 45m

English

O'Reilly Media, Inc.


Checking Documents for Well-Formedness

Every XML document, without exception, must be well-formed. This means it must adhere to a number of rules, including the following:

Every start-tag must have a matching end-tag.
Elements may nest but may not overlap.
There must be exactly one root element.
Attribute values must be quoted.
An element may not have two attributes with the same name.
Comments and processing instructions may not appear inside tags.
No unescaped < or & signs may occur in the character data of an element or attribute.
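
The rules above can be tried out against a real parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module as the parser; the helper name `is_well_formed` and the sample documents are illustrative and do not come from the book. Each malformed sample breaks one of the listed rules.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text, False if it reports an error.

    Hypothetical helper for illustration only.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One well-formed document, then one sample per rule in the list above.
samples = {
    "well-formed": '<person id="p1"><name>Alan</name></person>',
    "missing end-tag": "<person><name>Alan</person>",
    "overlapping elements": "<b><i>text</b></i>",
    "two root elements": "<a/><b/>",
    "unquoted attribute value": "<person id=p1/>",
    "duplicate attribute": '<person id="p1" id="p2"/>',
    "comment inside a tag": "<person <!-- note --> />",
    "unescaped ampersand": "<company>AT&T</company>",
    "unescaped less-than": "<expr>1 < 2</expr>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample should be accepted. The escaped forms of the last two samples, `AT&amp;T` and `1 &lt; 2`, are well-formed.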

This is not an exhaustive list. There are many, many ways a document can be malformed. You’ll find a complete list in Chapter 21. Some of these involve constructs that we have not yet discussed, such as DTDs. Others are extremely unlikely to occur if you follow the examples in this chapter (for example, including whitespace between the opening < and the element name in a tag).

Whether the error is small or large, likely or unlikely, an XML parser reading a document is required to report it. It may or may not report multiple well-formedness errors it detects in the document. However, the parser is not allowed to try to fix the document and make a best-faith effort of providing what it thinks the author really meant. It can’t fill in missing quotes around attribute values, insert an omitted end-tag, or ignore the comment that’s inside a start-tag. The parser is required to return an error. The objective here is to avoid the bug-for-bug compatibility wars that ...
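
As a rough illustration of this behavior, the sketch below feeds a document with an unquoted attribute value to Python's standard `xml.etree.ElementTree` parser. This is one parser chosen as an assumption for the example; the function name `report_error` is hypothetical. The parser raises an error that says where the problem is, and it does not supply the missing quotes.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def report_error(text: str) -> Optional[str]:
    """Return a description of the first well-formedness error, or None if there is none.

    Hypothetical helper for illustration only.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple for the error location.
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


malformed = "<greeting lang=en>Hello</greeting>"    # attribute value not quoted
repaired = '<greeting lang="en">Hello</greeting>'   # fixed by the author, not the parser

print(report_error(malformed))
print(report_error(repaired))
```

The first call returns an error message; the second returns `None`, because the author corrected the document and the parser no longer has anything to report.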

Publisher Resources

ISBN: 0596007647
