Removing Specific Characters from a String

Problem

You want to strip certain characters (e.g., whitespace) from a string.

Solution

Use translate( ) with an empty replacement string. For example, the following code strips whitespace from a string:

translate($input, " &#x9;&#xA;&#xD;", "")
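As a minimal sketch of how this expression might sit in a complete stylesheet, the following assumes a hypothetical sample value bound to $input. Because the expression lives inside a double-quoted select attribute, the string literals use single quotes. The whitespace characters are written as character references because a literal tab or newline in an attribute value would be normalized to a space by the XML parser.

```xslt
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical sample value containing space, tab, newline, and carriage return -->
    <xsl:variable name="input"
         select="' strip&#x9;all &#xA;white&#xD;space '"/>
    <!-- The empty third argument means every listed character is dropped -->
    <xsl:value-of select="translate($input, ' &#x9;&#xA;&#xD;', '')"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to any input document, this sketch should output stripallwhitespace.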

Discussion

translate( ) is a versatile string function that is often used to compensate for missing string-processing capabilities in XSLT. Here you use the fact that translate( ) will not copy characters in the input string that are in the from string but do not have a corresponding character in the to string.
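The two behaviors can be seen side by side in a small sketch. The sample strings are the ones used to illustrate translate( ) in the XPath 1.0 Recommendation. The surrounding stylesheet is only scaffolding for running them.

```xslt
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Every from character has a counterpart, so characters are only replaced: BAr -->
    <xsl:value-of select="translate('bar', 'abc', 'ABC')"/>
    <xsl:text>&#xA;</xsl:text>
    <!-- '-' is the fourth from character but the to string has only three, so it is dropped: AAA -->
    <xsl:value-of select="translate('--aaa--', 'abc-', 'ABC')"/>
  </xsl:template>
</xsl:stylesheet>
```

Characters that do not appear in the from string, such as the r in bar, are copied unchanged.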

You can also use translate to remove all but a specific set of characters from a string. For example, the following code removes all non-numeric characters from a string:

translate($string, 
          translate($string,'0123456789',''),'')

The inner translate( ) removes all characters of interest (e.g., numbers) to obtain a from string for the outer translate( ), which removes these non-numeric characters from the original string.
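To make the two steps easier to follow, this sketch splits the nested call across a variable. The sample value and the variable name unwanted are assumptions introduced only for illustration.

```xslt
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical sample value -->
    <xsl:variable name="string" select="'(555) 123-4567 ext. 89'"/>
    <!-- Inner call: delete the digits, leaving only the characters to remove -->
    <xsl:variable name="unwanted"
         select="translate($string, '0123456789', '')"/>
    <!-- Outer call: delete those characters from the original string -->
    <xsl:value-of select="translate($string, $unwanted, '')"/>
  </xsl:template>
</xsl:stylesheet>
```

Here $unwanted holds the parentheses, spaces, hyphen, period, and letters from the sample value, so the sketch should output 555123456789. Duplicate characters in $unwanted are harmless because translate( ) uses the first occurrence of each from character.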

Sometimes you do not want to remove all whitespace, but only leading, trailing, and redundant internal whitespace. XPath has a built-in function, normalize-space( ), that does just that. If you ever need to normalize on a character other than space, you might use the following code (where C is the character you want to normalize):

translate(normalize-space(translate($input,"C "," C")),"C "," C")

However, this transformation won’t work quite right if the input string contains whitespace characters other than spaces, i.e., tab (#x9), newline (#xA), and carriage return (#xD). The reason is that the code swaps space with the character to normalize, normalizes the resulting spaces, and then swaps back. Because normalize-space( ) treats all whitespace alike, any tab, newline, or carriage return that remains after the first swap is also collapsed to a space and then turned into C by the final swap, which might not be what you want. Then again, the applications of non-whitespace normalizing are probably rare anyway. The following template uses this technique to remove extra - characters:

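<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:variable name="input"
         select=" '---this --is-- the way we normalize non-whitespace---' "/>
    <!-- Swap - and space, normalize the spaces, then swap back -->
    <xsl:value-of
         select="translate(normalize-space(translate($input, '- ', ' -')),
                           '- ', ' -')"/>
  </xsl:template>
</xsl:stylesheet>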

The result is:

this -is- the way we normalize non-whitespace
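If the swap-normalize-swap expression is needed in more than one place, it could be wrapped in a named template. The following is only a sketch: the template name normalize-char and its parameters are hypothetical, $char is assumed to be a single non-whitespace character, and the caveat about tabs, newlines, and carriage returns still applies.

```xslt
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: collapses runs of $char and trims them from both ends -->
  <xsl:template name="normalize-char">
    <xsl:param name="input"/>
    <xsl:param name="char"/>
    <!-- Build the two swap strings once: "C " and " C" -->
    <xsl:variable name="from" select="concat($char, ' ')"/>
    <xsl:variable name="to" select="concat(' ', $char)"/>
    <xsl:value-of
         select="translate(normalize-space(translate($input, $from, $to)),
                           $from, $to)"/>
  </xsl:template>

  <xsl:template match="/">
    <xsl:call-template name="normalize-char">
      <xsl:with-param name="input" select="'__too___many__underscores_'"/>
      <xsl:with-param name="char" select="'_'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

With the sample value shown, this sketch should output too_many_underscores.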
