Document Type Definitions

As discussed earlier, Document Type Definitions, or DTDs, are the mechanism for specifying document types defined by the XML 1.0 recommendation. Though there are alternatives, DTDs remain one of the most common ways of specifying a document type. In this section, we discuss the syntax of the various declarations that can occur in the Document Type Declaration; all of these can appear in both the internal and external subsets.

Entity Declarations

Entities are sources of data that are used to compose a larger construct. Most, called general entities, are used to construct documents, but some, known as parameter entities, are used to construct the document type itself. Both are defined using an entity declaration in the Document Type Definition. The two kinds of entity are kept in separate name spaces (unrelated to XML namespaces): there can be a general entity named myEntity and a parameter entity of the same name, and the names do not clash.
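The separation of names can be sketched with a short script. This is a minimal illustration, not a definitive recipe: it assumes the standard library's xml.etree.ElementTree parser, which expands general entities declared in the internal subset, and the document, the entity name myEntity, and the replacement texts are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal subset declares a parameter entity
# and a general entity that share the name "myEntity".
DOCUMENT = """<!DOCTYPE doc [
  <!ENTITY % myEntity "parameter entity replacement text">
  <!ENTITY myEntity "general entity replacement text">
]>
<doc>&myEntity;</doc>"""

# Parsing succeeds: the two declarations do not clash.
root = ET.fromstring(DOCUMENT)

# The reference &myEntity; in document content refers to the general
# entity; the parameter entity of the same name plays no part here.
print(root.text)
```

The parameter entity is declared but never referenced in this sketch, since it only matters when composing the document type itself.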

Entities can be declared more than once; the first definition for a name takes precedence. Because the internal subset is processed before the external subset, this allows the internal subset to override a definition provided in the external subset. When used with parameter entities, this mechanism can be used to extend DTDs. Document type extension generally works best when the DTD being extended has been carefully designed with this in mind. The DocBook DTD for technical documentation is an excellent example of this.
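The first-declaration rule can be observed directly. The sketch below again assumes the standard library's xml.etree.ElementTree parser; the entity name greeting and its values are hypothetical, and both declarations are placed in the internal subset purely to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the general entity "greeting" is declared twice.
DOCUMENT = """<!DOCTYPE doc [
  <!ENTITY greeting "first declaration">
  <!ENTITY greeting "second declaration">
]>
<doc>&greeting;</doc>"""

# The duplicate declaration is not an error; the parser keeps the first
# definition it sees and ignores the later one.
root = ET.fromstring(DOCUMENT)

print(root.text)
```

An override from the internal subset works the same way: its declaration is seen first, so a later declaration of the same name in the external subset is ignored.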

General entities can take a variety of forms: they may be parsed entities, consisting of XML text, or unparsed ...
