[2.0] The unparsed-text() and unparsed-text-available() Functions
The last new function for combining documents is the unparsed-text() function. This lets you read in text from a URL. That text is not parsed, letting you read in text documents, comma-separated values, or even HTML documents that aren’t well-formed XML. What’s more, you can combine unparsed-text() with other new features such as the tokenize() function or the <xsl:analyze-string> element to process that text and transform it in a useful way.
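The companion function named in the heading, unparsed-text-available(), reports whether a call to unparsed-text() with the same argument would succeed. The following is a minimal sketch of guarding a read with it. The parameter name href, the template name main, and the file name addresses.csv are assumptions chosen for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: the URI of the text file to read.
       A relative URI is resolved against the stylesheet's base URI. -->
  <xsl:param name="href" select="'addresses.csv'"/>

  <!-- Hypothetical entry point; assumes the processor can start
       at a named template instead of a source document. -->
  <xsl:template name="main">
    <xsl:choose>
      <!-- Test first: unparsed-text() raises a dynamic error
           if the resource cannot be retrieved. -->
      <xsl:when test="unparsed-text-available($href)">
        <xsl:value-of select="unparsed-text($href)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>Could not read </xsl:text>
        <xsl:value-of select="$href"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Both functions also accept an optional second argument naming the character encoding, which can help when the file’s encoding cannot be determined automatically.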
As an example, we’ll read in a file of comma-separated values and output them as an HTML table of addresses. Here’s the comma-separated file, unparsed-text.csv:
Mr.,Chester Hasbrouck,Frisby,1234 Main Street,Sheboygan,WI,48392
Ms.,Natalie,Attired,707 Breitling Way,Winter Harbor,ME,00218
Ms.,Amanda,Reckonwith,930-A Chestnut Street,Lynn,MA,02930
Mrs.,Mary,Backstayge,283 First Avenue,Skunk Haven,MA,02718
We’ll go through three simple steps to process this data. First, we’ll use the tokenize() function to get each line of the file. Next, we’ll use tokenize() to get each comma-separated value. Finally, we’ll take each value and transform it appropriately. Using the comma-separated file we’ve listed here, the third comma-separated value in each line is the customer’s last name, the seventh value is the zip code, and so forth.
To process the file one line at a time, we’ll use this technique, courtesy of the XSLT 2.0 spec:
<xsl:for-each select="tokenize(unparsed-text('addresses.csv'), '\r?\n')"> ...
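One way the three steps could fit together is sketched below. This is an illustrative assembly, not necessarily the stylesheet the book goes on to build. The template name main, the table layout, and the column headings are assumptions, and the sketch presumes that no value contains an embedded comma or quotation marks.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Hypothetical entry point; assumes the processor can start
       at a named template instead of a source document. -->
  <xsl:template name="main">
    <html>
      <head><title>Addresses</title></head>
      <body>
        <table border="1">
          <tr><th>Name</th><th>Street</th><th>City, State, Zip</th></tr>
          <!-- Step 1: split the file into lines; the predicate
               skips blank lines such as a trailing empty one. -->
          <xsl:for-each
            select="tokenize(unparsed-text('addresses.csv'), '\r?\n')
                    [normalize-space()]">
            <!-- Step 2: split the current line into its values. -->
            <xsl:variable name="values" select="tokenize(., ',')"/>
            <!-- Step 3: place each value by its position in the line. -->
            <tr>
              <!-- Values 1-3: title, first name, last name. -->
              <td>
                <xsl:value-of select="$values[position() le 3]"
                  separator=" "/>
              </td>
              <!-- Value 4: street address. -->
              <td><xsl:value-of select="$values[4]"/></td>
              <!-- Values 5-7: city, state, zip code. -->
              <td>
                <xsl:value-of
                  select="concat($values[5], ', ', $values[6], ' ',
                                 $values[7])"/>
              </td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Storing the tokenized values in a variable keeps the positional lookups readable: $values[3] is the last name and $values[7] is the zip code, matching the description of the file above.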