Chapter 3. XPath: A Syntax for Describing Needles and Haystacks

XPath is a syntax used to describe parts of an XML document. With XPath, you can refer to the first <para> element, the quantity attribute of the <part-number> element, all <first-name> elements that contain the text "Joe", and many other variations. An XSLT stylesheet uses XPath expressions in the match and select attributes of various elements to indicate how a document should be transformed. In this chapter, we’ll discuss XPath in all its glory.
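To make those three examples concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The sample document and the variable names are invented for illustration. Because ElementTree cannot return an attribute directly from a path, the attribute is read from the matched element, and the "Joe" test is an exact match on the element's text rather than a substring search.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = """
<order>
  <para>First paragraph.</para>
  <para>Second paragraph.</para>
  <part-number quantity="12">A-100</part-number>
  <customer><first-name>Joe</first-name></customer>
  <customer><first-name>Ann</first-name></customer>
  <customer><first-name>Joe</first-name></customer>
</order>
"""

root = ET.fromstring(SAMPLE)

# The first <para> child of the root element.
first_para = root.find("para[1]")

# The quantity attribute of the <part-number> element. ElementTree has no
# attribute step, so we locate the element and read the attribute from it.
quantity = root.find("part-number").get("quantity")

# All <first-name> elements, at any depth, whose text is exactly "Joe".
joes = root.findall(".//first-name[.='Joe']")

print(first_para.text)
print(quantity)
print(len(joes))
```

In an XSLT stylesheet, the same selections would be written as XPath expressions inside match or select attributes rather than as Python calls.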

XPath is designed to be used inside an attribute in an XML document. The syntax is a mix of basic programming language expressions (such as $x*6) and Unix-like path expressions (such as /sonnet/author/last-name). In addition to the basic syntax, XPath provides a set of functions for working with node-sets, strings, numbers, and boolean values, which let you find out various things about the document.

One important point, though: XPath works with the parsed version of your XML document. That means that some details of the original document aren’t accessible to you from XPath. For example, entity references are resolved by the XSLT processor before the instructions in our stylesheet are evaluated. CDATA sections are converted to text as well. That means we have no way of knowing whether a text node in an XPath tree appeared in the original XML document as text, as an entity reference, or as part of a CDATA section. As you get used to thinking about your XML documents in terms of XPath expressions, this situation won’t be a ...
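The loss of that source-level detail can be seen with any XML parser, not only an XSLT processor. The sketch below is an analogy, assuming Python's standard xml.etree.ElementTree module and an invented sample document: three elements spell the same text with an entity reference, a CDATA section, and a numeric character reference, and after parsing the three are indistinguishable.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the same string written three different ways.
SAMPLE = (
    "<doc>"
    "<item>Tom &amp; Jerry</item>"          # predefined entity reference
    "<item><![CDATA[Tom & Jerry]]></item>"  # CDATA section
    "<item>Tom &#38; Jerry</item>"          # numeric character reference
    "</doc>"
)

# Iterating over the root element yields its <item> children in order.
texts = [item.text for item in ET.fromstring(SAMPLE)]

for text in texts:
    print(text)

# Only one distinct value remains once the document has been parsed.
print(len(set(texts)))
```

Nothing in the parsed tree records which spelling was used, which is the situation the paragraph above describes for XPath.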
