Working with DTDs

Schemas and validation play a major role in reliable communication between applications. A firm understanding of how to express document relationships within a schema is crucial to using schemas effectively. In this chapter, we concentrate on DTDs, but the concepts presented here apply to all schema languages. See the discussion of alternate schema languages in Chapter 2 for pointers to Python modules that support schema languages other than the DTD language defined as part of XML 1.0.

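A document's DTD can be supplied in the internal DTD subset, the external DTD subset, or a combination of the two. As the names suggest, the internal subset rides along with the XML document instance, inside its document type declaration, whereas the external subset lives in a separate resource: the document carries only a reference (a system identifier, optionally with a public identifier) telling the parser where to find the DTD.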
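To make the distinction concrete, the following minimal sketch uses the standard library's expat bindings (not xmlproc) to report what a document type declaration contains. The `note` documents, the `note.dtd` system identifier, and the `describe_doctype` helper are all hypothetical names chosen for illustration. Expat is used here as a non-validating parser: it only reads the declaration and, as configured in this sketch, never fetches the external subset.

```python
import xml.parsers.expat

# Hypothetical document whose DTD lives entirely in the internal subset.
INTERNAL_DOC = """<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
]>
<note>Remember the milk</note>"""

# Hypothetical document that only points at an external subset.
EXTERNAL_DOC = """<!DOCTYPE note SYSTEM "note.dtd">
<note>Remember the milk</note>"""


def describe_doctype(xml_text):
    """Return the name, system identifier, and internal-subset flag
    of the document type declaration found in xml_text."""
    info = {}

    def start_doctype(name, system_id, public_id, has_internal_subset):
        # expat passes has_internal_subset as an integer flag (0 or 1).
        info["name"] = name
        info["system_id"] = system_id
        info["has_internal_subset"] = bool(has_internal_subset)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = start_doctype
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return info


for label, text in (("internal", INTERNAL_DOC), ("external", EXTERNAL_DOC)):
    print(label, describe_doctype(text))
```

A document may also combine the two forms, giving both a system identifier and a bracketed internal subset in the same declaration; in that case the helper would report a system identifier and a true internal-subset flag together.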

The xmlproc package is a validating parser for Python. As of this writing, it is the only validating parser available for Python that is also implemented in Python. If you have the PyXML package installed, as we assume throughout this book, you already have xmlproc available and can use it in your programs. The xmlproc package can be imported from the xml.parsers package:

>>> from xml.parsers import xmlproc

Validating with the Internal DTD Subset

If you have been working with XML for a while, there is a good chance you can pick up the basic syntax of DTDs just by seeing a few examples. The xmlproc package features a command-line routine called xvcmd.py. This simple utility ...
