Chapter 8. Beyond Trees: XPath, XSLT, and More

In the last chapter, we introduced the concepts behind handling XML documents as memory trees. Our use of them was kind of primitive, limited to building, traversing, and modifying pieces of trees. This is okay for small, uncomplicated documents and tasks, but serious XML processing requires beefier tools. In this chapter, we examine ways to make tree processing easier, faster, and more efficient.

Tree Climbers

The first in our lineup of power tools is the tree climber. As the name suggests, it climbs a tree for you, finding the nodes in the order you want them, making your code simpler and more focused on per-node processing. Using a tree climber is like having a trained monkey climb up a tree to get you coconuts so you don’t have to scrape your own skin on the bark to get them; all you have to do is drill a hole in the shell and pop in a straw.

The simplest kind of tree climber is an iterator (sometimes called a walker). It can move forward or backward in a tree, doling out node references as you tell it to move. Moving forward means visiting the nodes in the same order in which they appear in the text representation of the document. The exact algorithm for iterating forward is this:

  • If there’s no current node, start at the root node.

  • If the current node has children, move to the first child.

  • Otherwise, if the current node has a following sibling, move to it.

  • If none of these options work, go back up the list of ...
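The forward-iteration steps above can be sketched in a few lines. What follows is a minimal illustration in Python, not code from any particular XML library: the Node and TreeIterator classes and the sample document are hypothetical stand-ins. Because the last step in the list is cut short here, the sketch assumes the usual document-order rule for it: climb through the ancestors until one of them has a following sibling, and stop when the climb reaches the root.

```python
class Node:
    """A bare-bones tree node (hypothetical stand-in for a real DOM node)."""

    def __init__(self, name, children=()):
        self.name = name
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self

    def next_sibling(self):
        """Return the following sibling, or None if there isn't one."""
        if self.parent is None:
            return None
        siblings = self.parent.children
        for index, sibling in enumerate(siblings):
            if sibling is self:  # compare by identity, not by name
                if index + 1 < len(siblings):
                    return siblings[index + 1]
                return None
        return None


class TreeIterator:
    """Hands out nodes one at a time, in document order."""

    def __init__(self, root):
        self.root = root
        self.current = None

    def forward(self):
        # No current node yet: start at the root.
        if self.current is None:
            self.current = self.root
            return self.current

        node = self.current

        # The current node has children: move to the first child.
        if node.children:
            self.current = node.children[0]
            return self.current

        # Otherwise try the following sibling. Failing that, climb to the
        # parent and try its following sibling, and so on.
        # (Assumed reading of the truncated last step in the list.)
        while node is not None and node is not self.root:
            sibling = node.next_sibling()
            if sibling is not None:
                self.current = sibling
                return self.current
            node = node.parent

        # The climb reached the root: there is nothing left to visit.
        return None


# A small sample tree, equivalent to:
# <doc><head><title/></head><body><para/><para/></body></doc>
document = Node("doc", [
    Node("head", [Node("title")]),
    Node("body", [Node("para"), Node("para")]),
])

iterator = TreeIterator(document)
visited = []
node = iterator.forward()
while node is not None:
    visited.append(node.name)
    node = iterator.forward()

print(visited)
```

The loop visits doc, head, title, body, and then the two para nodes, which is the order in which their start tags would appear in the serialized document. The iterator keeps only a reference to the current node, so the calling code can concentrate on per-node processing and leave the bookkeeping to the climber.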
