XML Processing Tools
Python ships with XML parsing support in its standard library and plays host to a vigorous XML special-interest group. XML (eXtended Markup Language) is a tag-based markup language for describing many kinds of structured data. Among other things, it has been adopted in roles such as a standard database and Internet content representation by many companies. As an object-oriented scripting language, Python mixes remarkably well with XML’s core notion of structured document interchange.
XML is based upon a tag syntax familiar to web page writers.
Python’s standard library xml
module package includes tools for parsing XML, in both the SAX and the
DOM parsing models. In short, SAX parsers provide a subclass with
methods called during the parsing operation, and DOM parsers are given
access to an object tree representing the (usually) already parsed
document. SAX parsers are essentially state machines and must record
details as the parse progresses; DOM parsers walk object trees using
loops, attributes, and methods defined by the DOM standard.
Beyond these parsing tools, Python also ships with an xmlrpclib
to support the XML-RPC protocol
(remote procedure calls that transmit objects encoded as XML over
HTTP), as well as a standard HTML parser, htmllib
, that works on similar principles
and is based upon the sgmllib
SGML parser module. The third-party domain has even more XML-related tools; some of these are maintained separately from Python to allow for more flexible ...
Get Programming Python, 3rd Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.