22.4. Document Type Definitions

You have seen several small examples of XML, and in each case it was fairly obvious what the content was meant to represent. But where are the rules that ensure such data is represented consistently and correctly in different documents? Do the <radius> and <position> elements have to appear in that sequence in a <circle> element, and could you omit either of them?

Clearly there has to be a way to determine what is correct and what is incorrect for any particular element in a document. As I mentioned earlier, a Document Type Definition (DTD) defines how valid elements are constructed for a particular type of document, so the XML for purchase order documents in a company could be defined by one DTD, and sales invoice documents by another. The DTD for a document is specified in a document type declaration, commonly known as a DOCTYPE declaration, that appears in the document prolog following any XML declaration. A DTD essentially defines a vocabulary for describing data of a particular kind; in other words, the set of elements that you use to identify the data. It also defines the possible relationships between these elements, that is, how they can be nested. The contents of a document of the type identified by a particular DTD must be defined and structured according to the rules that make up the DTD, and any document of a given type can be checked for validity against its DTD.
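To make the idea of checking a document against its DTD concrete, here is a minimal sketch using the standard JAXP validating parser. The DTD shown is an assumption made for illustration only: it says a <circle> must contain a <radius> followed by a <position>, each holding plain text, which may not match the element definitions used elsewhere in this chapter. The class name DtdValidationSketch and the method isValid() are likewise hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {
    // Assumed DTD, embedded in a DOCTYPE declaration in the prolog:
    // a circle contains exactly one radius followed by exactly one position.
    static final String DOCTYPE =
        "<!DOCTYPE circle [\n"
      + "  <!ELEMENT circle (radius, position)>\n"
      + "  <!ELEMENT radius (#PCDATA)>\n"
      + "  <!ELEMENT position (#PCDATA)>\n"
      + "]>\n";

    // Returns true if the given root element is valid according to the DTD above.
    static boolean isValid(String body) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);   // request DTD validation, not just well-formedness
        DocumentBuilder builder = factory.newDocumentBuilder();

        final boolean[] valid = { true };
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            // Validity violations are reported here as recoverable errors.
            public void error(SAXParseException e) { valid[0] = false; }
            // Well-formedness problems are fatal, so pass them on to the caller.
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });

        String xml = "<?xml version=\"1.0\"?>\n" + DOCTYPE + body;
        builder.parse(new InputSource(new StringReader(xml)));
        return valid[0];
    }

    public static void main(String[] args) throws Exception {
        // Elements in the declared order.
        System.out.println(isValid(
            "<circle><radius>15</radius><position>30,50</position></circle>"));
        // Same elements, but in the wrong sequence.
        System.out.println(isValid(
            "<circle><position>30,50</position><radius>15</radius></circle>"));
        // The position element is omitted.
        System.out.println(isValid(
            "<circle><radius>15</radius></circle>"));
    }
}
```

Under the assumed DTD, only the first document should be reported as valid; the other two are well-formed XML but break the declared content model, which is exactly the kind of rule the questions at the start of this section were asking about.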

A DTD can be an integral part of a document, but it is usually, and more usefully, ...
