Chapter 5. XML to Text

Text processing has made it possible to right-justify any idea, even one which cannot be justified on any other grounds.

J. Finegan

In the age of the Internet, formats such as HTML, XHTML, XML, and PDF clearly dominate the application of XSL and XSLT on the output side. However, plain old text will never become obsolete because it is the lowest common denominator in both human- and machine-readable formats. XML is often converted to text for import into another application that does not know how to read XML or does not interpret it the way you prefer. Text output is also used when the result will be sent to a terminal or post-processed in, for example, a Unix pipeline.

Many examples in this chapter focus on XSLT techniques that create generic XML-to-text converters. Here, generic means that the transformation can be customized easily to work on many different XML inputs, to produce a variety of outputs, or both. The techniques employed in these examples apply beyond the specifics of a given recipe and often beyond the domain of text processing. In particular, you may want to look at Recipe 5.2 through Recipe 5.5, even if they do not address a present need.

Of all the output formats supported by xsl:output, text is the one for which managing whitespace is the most crucial. For this reason, this chapter addresses the issue separately in Recipe 5.1. Developers inexperienced in XML and XSLT are often vexed by what seems to be fickle treatment of whitespace. ...
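
To make the whitespace concern concrete, here is a minimal sketch of a text-producing stylesheet. It is not taken from Recipe 5.1. The input vocabulary (people, person, name, age) is hypothetical and chosen only for illustration. The sketch relies on three standard XSLT 1.0 features: xsl:output with method="text", xsl:strip-space to discard whitespace-only text nodes in the input, and xsl:text to emit exactly the separator and newline characters that are wanted.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed (hypothetical) input shape:
     <people>
       <person><name>Ann</name><age>34</age></person>
       <person><name>Bob</name><age>27</age></person>
     </people> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Plain-text output: no XML declaration and no escaping of < or & -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Discard whitespace-only text nodes in the input so that
       source indentation cannot leak into the result -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="/people">
    <xsl:apply-templates select="person"/>
  </xsl:template>

  <!-- One comma-separated line per person -->
  <xsl:template match="person">
    <xsl:value-of select="name"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="age"/>
    <!-- Explicit line feed; whitespace-only text in the stylesheet
         is ignored unless it is wrapped in xsl:text -->
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input shown in the comment, this should produce the two lines Ann,34 and Bob,27, each ending in a line feed. Any XSLT 1.0 processor can run it. With xsltproc, for example, the invocation would be xsltproc people-to-text.xsl people.xml, where both file names are placeholders.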
