Programming Python, Second Edition

by Mark Lutz
March 2001
Content level: Intermediate to advanced
1296 pages
38h 8m
English
O'Reilly Media, Inc.

XML Processing Tools

Python ships with XML parsing support in its standard library and plays host to a vigorous XML special-interest group. XML (eXtensible Markup Language) is a tag-based markup language for describing many kinds of structured data. Among other things, many companies have adopted it as a standard representation for database and Internet content. As an object-oriented scripting language, Python mixes remarkably well with XML’s core notion of structured document interchange, and promises to be a major player in the XML arena.

XML is based upon a tag syntax familiar to web page writers. Python’s xmllib library module includes tools for parsing XML. In short, you use this parser by defining a subclass of its XMLParser class, with methods that serve as callbacks to be invoked as various XML structures are detected; the library module itself handles most of the text analysis. The module’s source code, file xmllib.py in the Python library, includes self-test code near the bottom that gives additional usage details. Python also ships with a standard HTML parser, htmllib, that works on similar principles and is based upon the sgmllib SGML parser module.
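
The callback-subclass pattern described above can be sketched as follows. This is a minimal illustration, not the xmllib interface itself: xmllib is not part of the standard library in current Python releases, so the sketch substitutes the xml.sax module, whose ContentHandler class follows the same idea of overriding methods that the parser calls as it detects tags and text. The names TagCollector and parse_events are hypothetical, chosen only for this example.

```python
import xml.sax


class TagCollector(xml.sax.ContentHandler):
    """Hypothetical handler: records each parse event the parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called by the parser when an opening tag is detected.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        # Called by the parser when a closing tag is detected.
        self.events.append(("end", name))

    def characters(self, content):
        # Called for text between tags; may arrive in several chunks.
        # Whitespace-only chunks are skipped to keep the event list short.
        if content.strip():
            self.events.append(("text", content))


def parse_events(xml_text):
    """Parse an XML string and return the list of recorded events."""
    handler = TagCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    sample = '<book title="Programming Python"><author>Mark Lutz</author></book>'
    for event in parse_events(sample):
        print(event)
```

As with the XMLParser subclass described in the text, the application code never scans the document itself; it only supplies methods, and the parser decides when to call them.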

Unfortunately, Python’s XML support is still evolving, and describing it is well beyond the scope of this book. Rather than going into further details here, I will instead point you to sources for more information:

Standard library

First off, be sure to consult the Python library manual for more ...

ISBN: 0596000855