By Sal Mangano
(: comment :) but users of
XPath/XSLT 1.0 should be aware that these comments are not legal
syntax. When we are showing the result of an XPath evaluation that is
empty, we will write (), which happens to be the
way one writes a literal empty sequence in XPath 2.0.
child:: axis specification, but you can if you are
feeling pedantic. One can reach deeper into the XML tree using the
descendant:: and the
descendant-or-self:: axes. The former excludes the
context node and the latter includes it.
<Test id="descendants">
  <parent>
    <X id="1"/>
    <X id="2"/>
    <Y id="3">
      <X id="3-1"/>
      <Y id="3-2"/>
      <X id="3-3"/>
    </Y>
    <X id="4"/>
    <Y id="5"/>
    <Z id="6"/>
    <X id="7"/>
    <X id="8"/>
    <Y id="9"/>
  </parent>
</Test>
(: Select all child elements named X :)
X (: same as child::X :)
Result: <X id="1"/> <X id="2"/> <X id="4"/> <X id="7"/> <X id="8"/>
(: Select the first X child element :)
X[1]
Result: <X id="1"/>
(: Select the last X child element :)
X[last()]
Result: <X id="8"/>
(: Select the first element, provided it is an X. Otherwise empty :)
*[1][self::X]
Result: <X id="1"/>
(: Select the last child, provided it is an X. Otherwise empty :)
*[last()][self::X]
Result: ()
*[last()][self::Y]
Result: <Y id="9"/>
(: Select all descendants named X :)
descendant::X
Result: <X id="1"/> <X id="2"/> <X id="3-1"/> <X id="3-3"/> <X id="4"/> <X id="7"/> <X id="8"/>
(: Select the context node, if it is an X, and all descendants named X :)
X before each
predicate, but one could equally substitute any path expression for
X, including those in Recipe 1.1.
eq, ne, lt, le, gt, and
ge) instead of the operators (=, !=,
<, <=, >, and
>=). This is because when one is comparing
atomic values, the new operators are preferred. In XPath 1.0, you
only have the latter operators so make the appropriate substitution.
The new operators were introduced in XPath 2.0 because they have
simpler semantics and will probably be more efficient as a result.
The complexity of the old operators comes when one considers cases
where a sequence is on either side of the comparison. Recipe 1.8 covers this topic further.
X[@a = 10] is not
the same as X[@a = '10'] when the attribute
a has an integer type. Here we assume there is no
schema and therefore all atomic values have the type
untypedAtomic. You can find more on this topic in
Recipes 1.9 and 1.10.
(: Select X child elements that have an attribute named a. :)
X[@a]
(: Select X children that have at least one attribute. :)
X[@*]
(: Select X children that have at least three attributes. :)
X[count(@*) > 2]
(: The empty node set :)
/..
(: The empty sequence constructor. :)
()
(: Sequence consisting of the single atomic item 1. :)
1
(: Use the comma operator to construct a sequence. Here we build a sequence of all X children of the context, followed by Y children, followed by Z children. :)
X, Y, Z
(: Use the to operator to construct ranges. :)
1 to 10
(: Here we combine comma with several ranges. :)
1 to 10, 100 to 110, 17, 19, 23
(: Variables and functions can be used as well. :)
1 to $x
1 to count(para)
(: Sequences do not nest so the following two sequences are the same. :)
((1,2,3), (4,5, (6,7), 8, 9, 10))
1,2,3,4,5,6,7,8,9,10
(: The to operator cannot create a decreasing sequence directly. :)
10 to 1 (: This sequence is empty! :)
(: You can accomplish the intended effect with the following. :)
for $n in 1 to 10 return 11 - $n
(: Remove duplicates from a sequence. :)
distinct-values($seq)
(: Return the size of a sequence. :)
count($seq)
(: Test if a sequence is empty. :)
empty($seq) (: prefer over count($seq) eq 0 :)
(: Locate the positions of an item in a sequence. index-of produces a sequence of integers for every item in the first arg that is eq to the second. :)
index-of($seq, $item)
(: Extract subsequences. :)
(: Up to 3 items from $seq, starting with the second. :)
subsequence($seq, 2, 3)
(: All items from $seq at position 3 to the end of the sequence. :)
subsequence($seq, 3)
(: Insert a sequence, $seq2, before the 3rd item in an input sequence, $seq1. :)
xsl:choose in
simple situations. These tricks rely on the fact that false converts
to 0 and true to 1 when used in a mathematical context.
(: min :)
($x <= $y) * $x + ($y < $x) * $y
(: max :)
($x >= $y) * $x + ($y > $x) * $y
(: abs :)
(1 - 2 * ($x < 0)) * $x
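The boolean-arithmetic trick above can be checked outside an XSLT processor. The following is a minimal Python sketch, relying on the fact that Python also treats True as 1 and False as 0 in arithmetic; the function names are hypothetical and are not part of XPath.

```python
def bool_min(x, y):
    # Exactly one of the two comparisons is true, so exactly one
    # term survives the multiplication.
    return (x <= y) * x + (y < x) * y


def bool_max(x, y):
    return (x >= y) * x + (y > x) * y


def bool_abs(x):
    # (x < 0) is 1 for negative input, giving a factor of -1;
    # otherwise the factor is 1.
    return (1 - 2 * (x < 0)) * x
```

The asymmetric pair of comparisons (<= with <, >= with >) matters: using <= in both terms of min would add the value twice when the operands are equal.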
if
expression.
(: Default the value of a missing attribute to 10. :)
if (@x) then @x else 10
(: Default the value of a missing element to 'unauthorized'. :)
if (password) then password else 'unauthorized'
(: Guard against division by zero. :)
if ($d ne 0) then $x div $d else 0
(: A para element's text if it contains at least one non-whitespace character; otherwise, a single space. :)
if (normalize-space(para)) then string(para) else ' '
xsl:if is not that bad, but if you need to express
if-then-else logic, you are now forced to use the bulkier
for
expression.
Here we show four cases demonstrating how the for
expression can map sequences of differing input and output sizes.
(: Sum of squares. :)
sum(for $x in $numbers return $x * $x)
(: Average of squares. :)
avg(for $x in $numbers return $x * $x)
(: Map a sequence of words in all paragraphs to a sequence of word lengths. :)
for $x in //para/tokenize(., ' ') return string-length($x)
(: Map a sequence of words in a paragraph to a sequence of word lengths for words longer than three letters. :)
for $x in //para/tokenize(., ' ') return if (string-length($x) gt 3) then string-length($x) else ()
(: Same as above but with a condition on the input sequence. :)
for $x in //para/tokenize(., ' ')[string-length(.) gt 3] return string-length($x)
(: Generate a sequence of squares of the first 100 integers. :)
for $i in 1 to 100 return $i * $i
(: Generate a sequence of squares in reverse order. :)
for $i in 0 to 10 return (10 - $i) * (10 - $i)
(: Map a sequence of paragraphs to a duped sequence of paragraphs. :)
for $x in //para return ($x, $x)
(: Duplicate words. :)
for $x in //para/tokenize(., ' ') return ($x, $x)
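For readers more familiar with list comprehensions, the mappings above can be mirrored in Python. This is only an illustrative sketch: the sample paragraphs and all variable names are made up, and plain strings stand in for para elements.

```python
# Hypothetical stand-in for the string values of //para.
paras = ["the quick brown fox", "a lazy dog"]

# //para/tokenize(., ' ')
words = [w for p in paras for w in p.split(" ")]

# Same size out as in: one length per word.
lengths = [len(w) for w in words]

# Fewer items out than in: returning () in XPath contributes nothing,
# which corresponds to filtering here.
long_lengths = [len(w) for w in words if len(w) > 3]

# More items out than in: each word is emitted twice.
duplicated = [copy for w in words for copy in (w, w)]

# No input sequence at all: squares in decreasing order.
squares_desc = [(10 - i) * (10 - i) for i in range(0, 11)]
```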
= and != operators in XPath 1.0
and 2.0 will suffice.
(: True if at least one section is referenced. :)
//section/@id = //ref/@idref
(: True if all section elements are referenced by some ref element. :)
count(//section) = count(//section[@id = //ref/@idref])
some and every expressions to do the same.
(: True if at least one section is referenced. :)
some $id in //section/@id satisfies $id = //ref/@idref
(: True if all section elements are referenced by some ref element. :)
every $id in //section/@id satisfies $id = //ref/@idref
(: There exists a section that references every section except itself. :)
some $s in //section satisfies
every $id in //section[@id ne $s/@id]/@id satisfies $id = $s/ref/@idref
(: $sequence2 is a sub-sequence of $sequence1 :)
count($sequence2) <= count($sequence1) and
every $pos in 1 to count($sequence1),
$item1 in $sequence1[$pos],
$item2 in $sequence2[$pos] satisfies $item1 = $item2
count($sequence1)
items in $sequence2 are the same as corresponding
items in $sequence1.
(: union :)
$set1 | $set2
(: intersection :)
$set1[count(. | $set2) = count($set2)]
(: difference :)
$set1[count(. | $set2) != count($set2)]
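The counting idiom in the XPath 1.0 intersection and difference expressions works because a union never contains duplicates: adding a node to $set2 leaves its size unchanged only if the node was already a member. A small Python sketch of that reasoning follows; integers stand in for node identities, which is an assumption of the sketch, and the function names are hypothetical.

```python
def union_count(node, node_set):
    # count(. | $set2): size of the union of one node with a set.
    return len(set(node_set) | {node})


def intersection(set1, set2):
    # Keep nodes whose union with set2 does not grow set2.
    return [n for n in set1 if union_count(n, set2) == len(set(set2))]


def difference(set1, set2):
    # Keep nodes whose union with set2 is larger than set2.
    return [n for n in set1 if union_count(n, set2) != len(set(set2))]
```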
union is added
as an alias. In addition, intersect and
except are added for intersection and set
difference respectively.
(: union :)
$set1 union $set2
(: intersection :)
$set1 intersect $set2
(: difference :)
$set1 except $set2
except
operator is used in an XPath 2.0 idiom for selecting all attributes
but a given set.
(: All attributes except @a. :)
@* except @a
(: All attributes except @a and @b. The parentheses are required because the comma operator binds more loosely than except. :)
@* except (@a, @b)
@*[local-name(.) != 'a' and local-name(.) != 'b']
(: union :)
distinct-values( ($items1, $items2) )
(: intersection :)
distinct-values( $items1[. = $items2] )
(: difference :)
distinct-values( $items1[not(. = $items2)] )
(: Test if $x and $y are the same exact node. :)
generate-id($x) = generate-id($y)
(: You can also take advantage of the | operator's removal of duplicates. :)
count($x|$y) = 1
(: Test if $x precedes $y in document order - note that this does not work if $x
or $y are attributes. :)
count($x/preceding::node()) < count($y/preceding::node()) or
$x = $y/ancestor::node()
(: Test if $x follows $y in document order - note that this does not work if $x
or $y are attributes. :)
count($x/following::node()) < count($y/following::node()) or
$y = $x/ancestor::node()
(: Test if $x and $y are the same exact node. :)
$x is $y
(: Test if $x precedes $y in document order. :)
$x << $y
(: Test if $x follows $y in document order. :)
$x >> $y
xsl:for-each-group element is preferred. See
Recipe 6.2 for examples.
<xsl:stylesheet version="1.0">
<!-- ... -->
</xsl:stylesheet>
(: Convert the first X child of the context to a number. :)
number(X[1]) + 17
(: Convert a number in $n to a string. :)
concat("id-", string($n))
(: Construct a date from a string. :)
xs:date("2005-06-01")
(: Construct doubles from strings. :)
xs:double("1.1e8") + xs:double("23000")
castable as, cast as, and treat as. Most of the time, you want to use the first two.
if ($x castable as xs:date) then $x cast as xs:date else xs:date("1970-01-01")
treat as, is not a conversion per se but rather an assertion that tells the XPath
processor that you promise at runtime a value will conform to a
specified type. If this turns out not to be the case, then a type
error will occur. XPath 2.0 added treat as so XPath implementers could perform static
(compile time) type checking in addition to dynamic type checking
while allowing programmers to selectively disable static type checks.
XSLT 2.0 implementations that perform static type checking will likely be rare, so
you can ignore treat as for the time being. It is far more likely to arise in higher-end XQuery
processors that do static type checking to facilitate various
optimizations.
(: Test if all invoiceDate elements have been validated as dates. The + occurrence indicator allows more than one element. :)
if (order/invoiceDate instance of element(*, xs:date)+) then "invoicing complete" else "invoicing incomplete"
instance of is only useful in the presence of schema validation. In addition,
it is not the same as castable as. For instance,
10 castable as xs:positiveInteger is always true but
10 instance of xs:positiveInteger is never
true because integer literals are labeled as xs:integer.
instance of but rather from the safety
and convenience of knowing that there will be no type error surprises
once validation is passed. This can lead to more concise stylesheets.
(: Without validation, you should code like this. :)
for $order in Order return xs:date($order/invoiceDate) - xs:date($order/createDate)
(: If you know all date elements have been validated, you can dispense with the xs:date constructor. :)
for $order in Order return $order/invoiceDate - $order/createDate
substring($value, (string-length($value) - string-length($substr)) + 1) = $substr
ends-with($value, $substr)
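The suffix comparison in the XPath 1.0 expression above can be exercised with a short Python sketch. The function name is hypothetical. Python slices are 0-based where XPath substring() is 1-based, so the +1 in the XPath version disappears here.

```python
def ends_with(value, substr):
    # Start of the last len(substr) characters; clamp at 0 so that a
    # suffix longer than the value compares against the whole value.
    start = max(0, len(value) - len(substr))
    return value[start:] == substr
```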
starts-with() function but no ends-with(). This is rectified in 2.0. However, as the
previous 1.0 code shows, ends-with can be
implemented easily in terms of substring() and string-length(). The code simply extracts the last
string-length($substr) characters from the target
string and compares them to the substring.
<xsl:template name="string-index-of">
<xsl:param name="input"/>
<xsl:param name="substr"/>
<xsl:choose>
<xsl:when test="contains($input, $substr)">
<xsl:value-of select="string-length(substring-before($input, $substr))+1"/>
</xsl:when>
<xsl:otherwise>0</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:function name="ckbk:string-index-of">
<xsl:param name="input"/>
<xsl:param name="substr"/>
<xsl:sequence select="if (contains($input, $substr))
then string-length(substring-before($input, $substr))+1
else 0"/>
</xsl:function>
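The same position calculation can be traced in Python. This is a sketch with a hypothetical function name; like the XSLT versions above, it returns a 1-based position and uses 0 to signal that the substring is absent.

```python
def string_index_of(text, substr):
    if substr in text:
        # Equivalent of substring-before(): everything ahead of the
        # first occurrence of substr.
        before = text[:text.index(substr)]
        return len(before) + 1
    return 0
```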
string-length(substring-before($value, $substr))+1. However, in general, you need a way
to handle the case in which the substring is not present. Here, zero
is chosen as an indication of this case, but you can use another
value such as -1 or NaN.
translate with an empty replace string. For
example, the following code can strip whitespace from a string:
translate($input, " &#x9;&#xa;&#xd;", "")
translate() is still a good idea in XSLT
2.0 because it will usually perform best. However, some string
removal tasks are much more naturally implemented using regular
expressions and the new replace() function:
(: \s matches all whitespace characters :)
replace($input,"\s","")
translate() is a versatile string function that is often
used to compensate for missing string-processing capabilities in XSLT
1.0. Here you use the fact that translate() will
not copy characters in the input string that are in the
from string but do not have a corresponding
character in the to string.
translate to remove all but a
specific set of characters from a string. For example, the following
code removes all non-numeric characters from a string:
translate($string, translate($string,'0123456789',''),'')
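To see why the nested call keeps only the wanted characters, the following Python sketch emulates the behavior of translate() as described in this recipe and then applies it twice. Both function names are hypothetical, and xpath_translate is an approximation written for illustration, not a library function.

```python
def xpath_translate(text, from_chars, to_chars):
    # Map each character of from_chars to the character at the same
    # position in to_chars; characters without a counterpart are removed.
    mapping = {}
    for i, ch in enumerate(from_chars):
        if ch in mapping:
            continue  # the first occurrence in from_chars wins
        mapping[ch] = to_chars[i] if i < len(to_chars) else None
    out = []
    for ch in text:
        replacement = mapping.get(ch, ch)
        if replacement is not None:
            out.append(replacement)
    return "".join(out)


def keep_only(text, allowed):
    # Inner call: strip the allowed characters, leaving the unwanted ones.
    unwanted = xpath_translate(text, allowed, "")
    # Outer call: strip those unwanted characters from the original.
    return xpath_translate(text, unwanted, "")
```

For example, keep_only("Tel: (555) 123-4567", "0123456789") leaves just the digits.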
translate() removes all characters of
interest (e.g., numbers) to obtain a from string
for the outer translate(), which removes these
non-numeric characters from the original string.
normalize-space(), which does just that. If you ever
needed to normalize based on characters other than spaces, then you
might use the following code (where C is the
character you want to normalize):
substr. Using this technique, you can create a
substring-before-last and a
substring-after-last:
<xsl:template name="substring-before-last">
<xsl:param name="input" />
<xsl:param name="substr" />
<xsl:if test="$substr and contains($input, $substr)">
<xsl:variable name="temp" select="substring-after($input, $substr)" />
<xsl:value-of select="substring-before($input, $substr)" />
<xsl:if test="contains($temp, $substr)">
<xsl:value-of select="$substr" />
<xsl:call-template name="substring-before-last">
<xsl:with-param name="input" select="$temp" />
<xsl:with-param name="substr" select="$substr" />
</xsl:call-template>
</xsl:if>
</xsl:if>
</xsl:template>
<xsl:template name="substring-after-last">
<xsl:param name="input"/>
<xsl:param name="substr"/>
<!-- Extract the string which comes after the first occurrence -->
<xsl:variable name="temp" select="substring-after($input,$substr)"/>
<xsl:choose>
<!-- If it still contains the search string then recursively process -->
<xsl:when test="$substr and contains($temp,$substr)">
<xsl:call-template name="substring-after-last">
<xsl:with-param name="input" select="$temp"/>
<xsl:with-param name="substr" select="$substr"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$temp"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
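The recursion in the two templates above can be followed more easily in a compact form. Below is a Python sketch that mirrors their branches one for one; all function names are hypothetical, and the two helpers approximate XPath's substring-before() and substring-after(), including their behavior when the separator is absent or empty.

```python
def substring_before(text, sep):
    if sep == "":
        return ""
    head, found, _ = text.partition(sep)
    return head if found else ""


def substring_after(text, sep):
    if sep == "":
        return text
    _, found, tail = text.partition(sep)
    return tail if found else ""


def substring_before_last(text, sep):
    if not sep or sep not in text:
        return ""
    temp = substring_after(text, sep)
    result = substring_before(text, sep)
    if sep in temp:
        # More occurrences remain: keep this separator and recurse.
        result += sep + substring_before_last(temp, sep)
    return result


def substring_after_last(text, sep):
    temp = substring_after(text, sep)
    if sep and sep in temp:
        return substring_after_last(temp, sep)
    return temp
```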
substring-before/after, but one can get the
desired effect using the versatile tokenize() function that uses regular expressions:
N times, where
N is a parameter. For example, you might need to
pad out a string with spaces to achieve alignment.
$count is odd:
<xsl:template name="dup">
<xsl:param name="input"/>
<xsl:param name="count" select="2"/>
<xsl:choose>
<xsl:when test="not($count) or not($input)"/>
<xsl:when test="$count = 1">
<xsl:value-of select="$input"/>
</xsl:when>
<xsl:otherwise>
<!-- If $count is odd append an extra copy of input -->
<xsl:if test="$count mod 2">
<xsl:value-of select="$input"/>
</xsl:if>
<!-- Recursively apply template after doubling input and
halving count -->
<xsl:call-template name="dup">
<xsl:with-param name="input"
select="concat($input,$input)"/>
<xsl:with-param name="count"
select="floor($count div 2)"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
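The template above doubles the input and halves the count on each call, so the recursion depth grows with the logarithm of the count, not with the count itself. A Python sketch of the same strategy follows; the function name is hypothetical and the sketch assumes a non-negative count.

```python
def dup(text, count=2):
    # Nothing to emit for a zero count or an empty string.
    if not count or not text:
        return ""
    if count == 1:
        return text
    # An odd count needs one extra copy before the halving step.
    extra = text if count % 2 else ""
    return extra + dup(text + text, count // 2)
```

For a count of 5 the calls are dup("a", 5), dup("aa", 2), and dup("aaaa", 1), with one extra "a" emitted at the first level.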
for expression. We overload dup to replicate the
behavior of the defaulted argument in the XSLT 1.0 implementation:
<xsl:function name="ckbk:dup">
<xsl:param name="input" as="xs:string"/>
<xsl:sequence select="ckbk:dup($input,2)"/>
</xsl:function>
<xsl:function name="ckbk:dup">
<xsl:param name="input" as="xs:string"/>
<xsl:param name="count" as="xs:integer"/>
<xsl:sequence select="string-join(for $i in 1 to $count return $input,'')"/>
</xsl:function>
$input in a subtle yet
effective way:
<xsl:template name="reverse">
<xsl:param name="input"/>
<xsl:variable name="len" select="string-length($input)"/>
<xsl:choose>
<!-- Strings of length less than 2 are trivial to reverse -->
<xsl:when test="$len &lt; 2">
<xsl:value-of select="$input"/>
</xsl:when>
<!-- Strings of length 2 are also trivial to reverse -->
<xsl:when test="$len = 2">
<xsl:value-of select="substring($input,2,1)"/>
<xsl:value-of select="substring($input,1,1)"/>
</xsl:when>
<xsl:otherwise>
<!-- Swap the recursive application of this template to
the first half and second half of input -->
<xsl:variable name="mid" select="floor($len div 2)"/>
<xsl:call-template name="reverse">
<xsl:with-param name="input"
select="substring($input,$mid+1,$mid+1)"/>
</xsl:call-template>
<xsl:call-template name="reverse">
<xsl:with-param name="input"
select="substring($input,1,$mid)"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
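The divide-and-conquer structure of the template above is easier to see without the XSLT call syntax. Here is a Python sketch with a hypothetical function name: it reverses the second half, then the first half, and concatenates them in swapped order.

```python
def reverse(text):
    n = len(text)
    if n < 2:
        return text
    if n == 2:
        return text[1] + text[0]
    mid = n // 2
    # Second half first, then first half, each reversed recursively.
    return reverse(text[mid:]) + reverse(text[:mid])
```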
<xsl:function name="ckbk:reverse">
<xsl:param name="input" as="xs:string"/>
<xsl:sequence select="codepoints-to-string(
reverse(string-to-codepoints($input)))"/>
</xsl:function>
<xsl:template name="search-and-replace">
<xsl:param name="input"/>
<xsl:param name="search-string"/>
<xsl:param name="replace-string"/>
<xsl:choose>
<!-- See if the input contains the search string -->
<xsl:when test="$search-string and
contains($input,$search-string)">
<!-- If so, then concatenate the substring before the search
string to the replacement string and to the result of
recursively applying this template to the remaining substring.
-->
<xsl:value-of
select="substring-before($input,$search-string)"/>
<xsl:value-of select="$replace-string"/>
<xsl:call-template name="search-and-replace">
<xsl:with-param name="input"
select="substring-after($input,$search-string)"/>
<xsl:with-param name="search-string"
select="$search-string"/>
<xsl:with-param name="replace-string"
select="$replace-string"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<!-- There are no more occurrences of the search string so
just return the current input string -->
<xsl:value-of select="$input"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
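The recursion in the search-and-replace template can be sketched in Python as follows. The function name is hypothetical. As in the template, only the text after each match is processed again, so replacement text is never rescanned.

```python
def search_and_replace(text, search, replacement):
    # An empty search string matches nothing, mirroring the
    # "$search-string and contains(...)" guard in the template.
    if search and search in text:
        before, _, after = text.partition(search)
        return before + replacement + search_and_replace(after, search, replacement)
    return text
```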
translate() function. This code, for example, converts
from upper- to lowercase:
translate($input,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')
translate($input, 'abcdefghijklmnopqrstuvwxyz','ABCDEFGHIJKLMNOPQRSTUVWXYZ')
upper-case() and lower-case():
upper-case($input)
lower-case($input)
ß (eszett) is converted to an
uppercase SS. Many modern programming languages
provide case-conversion functions that are sensitive to locale, but
XSLT does not support this concept directly. This is unfortunate,
considering that XSLT has other features supporting
internationalization.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE stylesheet [
<!ENTITY UPPERCASE "ABCDEFGHIJKLMNOPQRSTUVWXYZ">
<!ENTITY LOWERCASE "abcdefghijklmnopqrstuvwxyz">
<!ENTITY UPPER_TO_LOWER " '&UPPERCASE;' , '&LOWERCASE;' ">
<!ENTITY LOWER_TO_UPPER " '&LOWERCASE;' , '&UPPERCASE;' ">
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="test"
select=" 'The rain in Spain falls mainly on the plain' "/>
<output>
<lowercase>
<xsl:value-of
select="translate($test,&UPPER_TO_LOWER;)"/>
</lowercase>
<uppercase>
<xsl:value-of
select="translate($test,&LOWER_TO_UPPER;)"/>
</uppercase>
</output>
</xsl:template>
</xsl:stylesheet>
token element text. It also defaults to
character-level tokenization if the delimiter string is empty:
<xsl:template name="tokenize">
<xsl:param name="string" select="''" />
<xsl:param name="delimiters" select="' &#x9;&#xA;'" />
<xsl:choose>
<!-- Nothing to do if empty string -->
<xsl:when test="not($string)" />
<!-- No delimiters signals character level tokenization. -->
<xsl:when test="not($delimiters)">
<xsl:call-template name="_tokenize-characters">
<xsl:with-param name="string" select="$string" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="_tokenize-delimiters">
<xsl:with-param name="string" select="$string" />
<xsl:with-param name="delimiters" select="$delimiters" />
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template name="_tokenize-characters">
<xsl:param name="string" />
<xsl:if test="$string">
<token><xsl:value-of select="substring($string, 1, 1)" /></token>
<xsl:call-template name="_tokenize-characters">
<xsl:with-param name="string" select="substring($string, 2)" />
</xsl:call-template>
</xsl:if>
</xsl:template>
<xsl:template name="_tokenize-delimiters">
<xsl:param name="string" />
<xsl:param name="delimiters" />
<xsl:param name="last-delimit"/>
<!-- Extract a delimiter -->
<xsl:variable name="delimiter" select="substring($delimiters, 1, 1)" />
<xsl:choose>
<!-- If the delimiter is empty we have a token -->
<xsl:when test="not($delimiter)">
<token><xsl:value-of select="$string"/></token>
</xsl:when>
<!-- If the string contains at least one delimiter we must split it -->
<xsl:when test="contains($string, $delimiter)">
<!-- If it starts with the delimiter we don't need to handle the -->
<!-- before part -->
<xsl:if test="not(starts-with($string, $delimiter))">
<!-- Handle the part that comes before the current delimiter -->
<!-- with the next delimiter. If there is no next the first test -->
<!-- in this template will detect the token -->
<xsl:call-template name="_tokenize-delimiters">
<xsl:with-param name="string"
select="substring-before($string, $delimiter)" />
<xsl:with-param name="delimiters"
select="substring($delimiters, 2)" />
</xsl:call-template>
</xsl:if>
<!-- Handle the part that comes after the delimiter using the -->
<!-- current delimiter -->
<xsl:call-template name="_tokenize-delimiters">
<xsl:with-param name="string"
select="substring-after($string, $delimiter)" />
<xsl:with-param name="delimiters" select="$delimiters" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<!-- No occurrences of current delimiter so move on to next -->
<xsl:call-template name="_tokenize-delimiters">
<xsl:with-param name="string"
select="$string" />
<xsl:with-param name="delimiters"
select="substring($delimiters, 2)" />
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
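The control flow of the tokenize templates can be traced with a Python sketch that follows the same branches. The function names are hypothetical, and a list of strings stands in for the generated token elements. The part before a delimiter is handed on to the remaining delimiters, while the part after it is processed again with the full delimiter list.

```python
def tokenize(text, delimiters=" \t\n"):
    if not text:
        return []
    if not delimiters:
        # No delimiters: character-level tokenization.
        return list(text)
    return _tokenize_delimiters(text, delimiters)


def _tokenize_delimiters(text, delimiters):
    delimiter = delimiters[:1]
    if not delimiter:
        # All delimiters exhausted: what is left is a token.
        return [text]
    if delimiter in text:
        before, _, after = text.partition(delimiter)
        tokens = []
        if not text.startswith(delimiter):
            tokens += _tokenize_delimiters(before, delimiters[1:])
        tokens += _tokenize_delimiters(after, delimiters)
        return tokens
    # Current delimiter does not occur: move on to the next one.
    return _tokenize_delimiters(text, delimiters[1:])
```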