Chapter 7. XPath

When writing code to process XML, you often want to select specific parts of an XML document and process them in a particular way. For example, you might want to select only the invoices that fall within a date range of interest. Similarly, you may want to exclude some parts of an XML document from processing. For example, if you make basic human resources data available on your corporate intranet, you probably want to be sure not to display confidential information such as an employee's salary. To meet these basic needs, you need a technology that lets you select the part or parts of an XML document to process. The XML Path Language, XPath, is designed to let the developer do exactly that.
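To make the idea of selecting parts of a document concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. Note the assumptions: the staff, employee, name, department, and salary elements are invented purely for illustration, and ElementTree understands only a small subset of XPath 1.0 path syntax, so this is not a demonstration of XPath 2.0 or of a full XPath processor such as Saxon.

```python
import xml.etree.ElementTree as ET

# Hypothetical HR document; every element and attribute name here is
# invented for illustration and does not come from the book.
HR_XML = """
<staff>
  <employee id="e1">
    <name>Ada Jones</name>
    <department>Finance</department>
    <salary>52000</salary>
  </employee>
  <employee id="e2">
    <name>Raj Patel</name>
    <department>IT</department>
    <salary>61000</salary>
  </employee>
</staff>
"""

root = ET.fromstring(HR_XML)

# Path expression: the name child of every employee child of the root.
# The salary elements are never selected, so they are never displayed.
names = [el.text for el in root.findall("./employee/name")]

# Predicate: only employees whose department child has the text "IT".
it_ids = [el.get("id") for el in root.findall("./employee[department='IT']")]

print(names)
print(it_ids)
```

The first expression picks out only the data that is safe to publish, and the second narrows the selection with a condition in square brackets. Both ideas, location paths and predicates, are central to XPath itself.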

The latest version of XPath to be given Candidate Recommendation status by the W3C is version 2.0. The specification can be viewed at www.w3.org/TR/xpath20/. Because this version is not yet a full Recommendation, appeared only in June 2006, and is vastly larger than version 1.0, only a few processors support it so far. The leading processor is Saxon, which comes in Java and .NET versions and is available in free and paid-for editions, the latter implementing some of the more advanced, optional features. You can read how to install and configure Saxon in Chapter 8, which is devoted to XSLT. XPath was designed specifically for use with Extensible Stylesheet Language Transformations (XSLT), ...
