Chapter 23. Structured Text: XML

XML, the eXtensible Markup Language, is a widely used data interchange format. On top of XML itself, the XML community (in good part within the World Wide Web Consortium [W3C]) has standardized many other technologies, such as schema languages, namespaces, XPath, XLink, XPointer, and XSLT.

Industry consortia have defined industry-specific markup languages on top of XML for data exchange among applications in their respective fields. XML, XML-based markup languages, and other XML-related technologies are often used for inter-application, cross-language, cross-platform data interchange in specific industries.

Python’s standard library, for historical reasons, has multiple modules supporting XML under the xml package, with overlapping functionality; this book does not cover them all, so see the online documentation.

This book (and, specifically, this chapter) covers only the most Pythonic approach to XML processing: ElementTree, whose elegance, speed, generality, multiple implementations, and Pythonic architecture make it the package of choice for Python XML applications. For complete tutorials and all details on the xml.etree.ElementTree module, see the online docs and the website of ElementTree’s creator, Fredrik Lundh, best known as “the effbot.”1

This book takes for granted some elementary knowledge of XML itself; if you need to learn more about XML, we recommend the book XML in a Nutshell (O’Reilly).

Parsing XML from untrusted sources puts your ...

Get Python in a Nutshell, 3rd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.