Chapter 1. Data Model

XSLT is a language for transforming XML documents. The input to an XSLT program (a “stylesheet”) is one or more XML documents. The output is another document, which may be XML, HTML, or text. XSLT operates on an abstraction of XML, called the XSLT data model (the XPath data model with some additions). XSLT is “closed” over this data model. In other words, its data model applies both to its input and its output. In fact, it even models the stylesheet, which is itself expressed in XML.

Tip

Unless explicitly followed by “2.0,” whenever this book speaks of “XSLT” or “XPath,” it is referring to the 1.0 versions of these languages.

Node Types

The XPath data model describes an XML document as a tree of nodes. There are seven types of nodes:

root
element
processing instruction
comment
text
attribute
namespace

In the XPath 1.0 data model, all XML documents have a single root node, which is an invisible container for the entire document. The root node is not an element.

Tip

XPath 2.0 uses the term “document node” instead of “root node.” Regardless of what it’s called, don’t confuse it with the “root element” or “document element,” which is an element: a child of the root node, or document node.
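To make the distinction concrete, here is a minimal sketch. The input document and its element name doc are invented for illustration; they do not come from the text above.

```xml
<?xml version="1.0"?>
<!-- a comment before the document element -->
<doc>hello</doc>
```

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- "/" matches the root node, not the <doc> element -->
  <xsl:template match="/">
    <xsl:text>children of root node: </xsl:text>
    <xsl:value-of select="count(child::node())"/>
    <xsl:text>&#10;document element: </xsl:text>
    <!-- "*" selects the element children of the context node (here, the root node) -->
    <xsl:value-of select="name(*)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For this input, the root node should report two children (the comment and the doc element), and the document element's name should be doc. The XML declaration is not a processing instruction and contributes no node.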

There is one element node for each element, one attribute node for each attribute (excluding namespace declarations), one comment node for each comment, and one processing instruction node for each processing instruction (PI) that occurs in an XML document. A contiguous sequence of character data, after expanding all entities and CDATA sections, is modeled as a single text node. Finally, there is a namespace node attached to each element for each namespace/prefix binding that is in scope on that element. Each element has its own unique set of namespace nodes, which always includes at least one namespace node that corresponds to the implicit mapping between the prefix "xml" and the URI "http://www.w3.org/XML/1998/namespace" (reserved for attributes such as xml:lang and xml:space).

Tip

Thus, even for a document that does not explicitly use namespaces, there will be as many namespace nodes as there are elements.
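The following sketch counts nodes to illustrate two of the points above: contiguous character data forms one text node, and every element carries a namespace node for the xml prefix. The input document is hypothetical, and the counts assume a conforming XSLT 1.0 processor that supports the namespace axis.

```xml
<doc><p>a &amp; b <![CDATA[<c>]]> d</p><q/></doc>
```

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:text>elements: </xsl:text>
    <xsl:value-of select="count(//*)"/>
    <!-- one implicit "xml" namespace node per element -->
    <xsl:text>&#10;namespace nodes: </xsl:text>
    <xsl:value-of select="count(//namespace::*)"/>
    <!-- the entity reference and CDATA section are merged into the surrounding text -->
    <xsl:text>&#10;text nodes in p: </xsl:text>
    <xsl:value-of select="count(/doc/p/text())"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With this input there are three elements and, because no namespaces are declared, three namespace nodes. The p element should contain a single text node whose string-value is "a & b <c> d".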

Node Properties

Table 1-1 lists four node properties and their applicability for each type of node. These properties deal with a node’s relationship to other nodes. A cell that is empty or marked n/a means the property is not applicable for that node type.

Table 1-1. Node relationship properties

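Node type	Parent	Children	Attributes	Namespace nodes
Root	n/a	Ordered list of 0 or more elements, PIs, comments, and text nodes	n/a	n/a
Element	Element or root	Ordered list of 0 or more elements, PIs, comments, and text nodes	Unordered list of 0 or more attribute nodes	Unordered list of 1 or more namespace nodes
PI	Element or root	n/a	n/a	n/a
Comment	Element or root	n/a	n/a	n/a
Text	Element or root	n/a	n/a	n/a
Attribute	Element	n/a	n/a	n/a
Namespace	Element	n/a	n/a	n/a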

In the XPath language, to access a node’s parent, child nodes, attributes, or namespace nodes, use the corresponding axis: parent, child, attribute, or namespace. See the section Axes in Chapter 2.

Tip

Attributes and namespace nodes are not children. An element is considered to be the parent of an attribute or namespace node, but the attribute or namespace node is not considered to be the element’s child.

The descendants of a node consist of the node’s children, its children’s children, and so on.
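As a rough illustration of these relationships, the sketch below counts the nodes on several axes of one element. The item and name elements and the id and class attributes are made up for the example.

```xml
<item id="i1" class="x"><name>n</name></item>
```

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="item">
    <!-- attributes and namespace nodes are not counted on the child axis -->
    <xsl:text>children: </xsl:text>
    <xsl:value-of select="count(child::node())"/>
    <xsl:text>&#10;descendants: </xsl:text>
    <xsl:value-of select="count(descendant::node())"/>
    <xsl:text>&#10;attributes: </xsl:text>
    <xsl:value-of select="count(attribute::*)"/>
    <xsl:text>&#10;namespace nodes: </xsl:text>
    <xsl:value-of select="count(namespace::*)"/>
    <!-- the element is nevertheless the parent of its attributes -->
    <xsl:text>&#10;parent of id attribute: </xsl:text>
    <xsl:value-of select="name(attribute::id/parent::*)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For this input, item should have one child (the name element), two descendants (name and its text node), two attributes, and one namespace node (for the xml prefix). The parent of the id attribute is item, even though id is not among item's children.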

All nodes, regardless of their type, have a string-value and a base URI. Some types of nodes have an expanded-name, which consists of two strings: a local part and a namespace URI. Element nodes have an optional unique ID, and the root node holds the document’s unparsed entity URIs. For each of these node properties, Table 1-2 lists the node types it applies to and how its value is determined. Once again, a cell that is empty or marked n/a means the property is not applicable for that node type.

Table 1-2. String-typed node properties

Node type	String-value	Expanded-name (local/URI)	Base URI	Unique ID	Unparsed entity URIs
Root	Concatenation of descendant text nodes’ string-values, in document order	n/a	URI of the document entity	n/a	A set of mappings between declared entity names and their URIs
Element	Concatenation of descendant text nodes’ string-values, in document order	Local: local name; URI: namespace name	URI of external entity; otherwise, base URI of root	Value of attribute declared as type ID in DTD (optional)	n/a
PI	Text following PI target and whitespace	Local: PI target; URI: null	URI of external entity; otherwise, base URI of root	n/a	n/a
Comment	Content of comment	n/a	Base URI of parent node	n/a	n/a
Text	Character data (at least one character)	n/a	Base URI of parent node	n/a	n/a
Attribute	Normalized attribute value	Local: local name; URI: namespace name	Base URI of parent node	n/a	n/a
Namespace	Namespace URI	Local: namespace prefix; URI: null	Base URI of parent node	n/a	n/a

The XPath language provides functions for directly accessing most of these properties. To access the string-value of a node, use the string( ) function.

Tip

It’s not usually necessary to use string( ) explicitly, thanks to XPath’s automatic conversion of data types. See the Data Type Conversions section in Chapter 5.

To access the local and namespace URI parts of a node’s expanded-name, use the local-name( ) and namespace-uri( ) functions, respectively.
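The sketch below applies these three functions to a small document. The inv prefix, the namespace URI http://example.com/inventory, and the order and status names are all hypothetical.

```xml
<inv:order xmlns:inv="http://example.com/inventory" inv:status="open">42</inv:order>
```

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- "*" is the document element; "*/@*" is its only attribute -->
    <xsl:text>element local part: </xsl:text>
    <xsl:value-of select="local-name(*)"/>
    <xsl:text>&#10;element namespace URI: </xsl:text>
    <xsl:value-of select="namespace-uri(*)"/>
    <xsl:text>&#10;element string-value: </xsl:text>
    <xsl:value-of select="string(*)"/>
    <xsl:text>&#10;attribute local part: </xsl:text>
    <xsl:value-of select="local-name(*/@*)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The element's local part should be order and its namespace URI http://example.com/inventory, with a string-value of 42. The attribute's local part is status. The xmlns:inv declaration is not an attribute node, so the */@* path selects only inv:status.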

The base URI property is used to resolve relative URIs that appear in a document. XSLT’s document( ) function and the xsl:import and xsl:include elements rely on it. XSLT/XPath 1.0 does not provide a direct way to access the base URI property.

Tip

XPath 2.0, however, includes a function, base-uri( ), for directly accessing the base URI of a given node. Unlike XSLT 1.0, the 2.0 data model also takes the xml:base attribute into account when determining a node’s base URI.

The unique ID property is queried by the id( ) function to retrieve elements according to their ID value. There is no function to access the unique ID property directly, but that is not normally necessary, since you can easily access an element’s attribute values using the attribute axis.
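A minimal sketch of an id( ) lookup follows. The list and item elements and the code attribute are invented for the example, and the lookup works only if the XML parser behind the XSLT processor reads the DTD, since that is where code is declared as type ID.

```xml
<?xml version="1.0"?>
<!DOCTYPE list [
  <!ELEMENT list (item*)>
  <!ELEMENT item (#PCDATA)>
  <!ATTLIST item code ID #REQUIRED>
]>
<list><item code="a1">first</item><item code="b2">second</item></list>
```

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- id() returns the element whose unique ID is "b2" -->
    <xsl:value-of select="id('b2')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- the ID value itself is read through the attribute axis -->
    <xsl:value-of select="id('b2')/@code"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

When the DTD is processed, the first value-of should output the string-value of the matching item (second) and the next one the attribute value (b2). Without the DTD, no attribute has type ID and id('b2') selects nothing.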

Finally, use the unparsed-entity-uri( ) function to retrieve the URI of an unparsed entity with a given name.
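As a sketch of how this might look, the hypothetical document below declares an unparsed entity named logo and refers to it from an attribute of type ENTITY. The entity name, notation, and file name are assumptions made for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE doc [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
  <!ELEMENT doc EMPTY>
  <!ATTLIST doc img ENTITY #REQUIRED>
]>
<doc img="logo"/>
```

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="doc">
    <!-- the attribute's string-value ("logo") is used as the entity name -->
    <xsl:value-of select="unparsed-entity-uri(@img)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The result is a URI for logo.gif. Whether it appears in absolute or relative form depends on the processor, and an undeclared entity name yields the empty string.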

All of XPath and XSLT’s built-in functions are described in Chapter 5.


XSLT 1.0 Pocket Reference by Evan Lenz
