22.2. XML Document Structure

An XML document consists of two parts, a prolog and a document body:

  • The prolog provides information necessary for the interpretation of the contents of the document body. It contains two optional components, and since you can omit both, the prolog itself is optional. The two components of the prolog, in the sequence in which they must appear, are as follows:

    • An XML declaration that identifies the version of XML that applies to the document. It may also specify the character encoding used in the document and whether or not the document is standalone. You can omit either the character encoding or the standalone specification, but if both appear, they must be in that sequence.

    • A document type declaration that references an external Document Type Definition (DTD) containing markup declarations for the elements used in the body of the document, or that includes explicit markup declarations, or both.

  • The document body contains the data. It comprises one or more elements, where each element is defined by a start tag and an end tag (or, for an element with no content, by a single empty-element tag). The elements in the document body define the structure of the data. There is always a single root element that contains all the other elements, so all of the data in the document is contained within the elements in the document body.
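To make the two parts concrete, here is a minimal sketch that parses a small document with the standard JAXP DOM parser and reports what it found in the prolog and which element is the root. The class name DocumentStructureSketch, the describe() helper, and the sample address document are hypothetical and exist only for illustration; the parser calls themselves are part of the standard javax.xml.parsers and org.w3c.dom APIs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DocumentStructureSketch {

    // Hypothetical sample: a prolog (XML declaration, then a document type
    // declaration with explicit markup declarations) followed by the body.
    static final String SAMPLE =
          "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
        + "<!DOCTYPE address [\n"
        + "  <!ELEMENT address (street, city)>\n"
        + "  <!ELEMENT street (#PCDATA)>\n"
        + "  <!ELEMENT city (#PCDATA)>\n"
        + "]>\n"
        + "<address>\n"
        + "  <street>29 South Street</street>\n"
        + "  <city>Concord</city>\n"
        + "</address>\n";

    // Parses the text and summarizes the prolog details and the root element.
    static String describe(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // getDoctype() returns null when there is no document type declaration.
            String doctype =
                doc.getDoctype() == null ? "none" : doc.getDoctype().getName();

            return "version=" + doc.getXmlVersion()
                 + " encoding=" + doc.getXmlEncoding()
                 + " standalone=" + doc.getXmlStandalone()
                 + " doctype=" + doctype
                 + " root=" + doc.getDocumentElement().getTagName();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        // A document with a full prolog.
        System.out.println(describe(SAMPLE));
        // A document with no prolog at all: only the body is present.
        System.out.println(describe("<note>No prolog here</note>"));
    }
}
```

The second call in main() passes a document that has no prolog, which the parser should still accept because both prolog components are optional. In that case there is no document type declaration to report, and the root element is the only structural information available.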

Processing instructions (PI) for the document may also appear at the end of the prolog and at the end of the document body. Processing instructions are ...
