Representing Individual Characters
A single character can be used to represent itself in a regular expression. In this case, it is known as a normal character. For example, the regular expression d matches the letter d, and def matches the string def, as you might expect. Each of the three single characters (d, e, and f) is its own atom, and it can have a quantifier associated with it. For example, the regular expression d+ef matches the strings def, ddef, dddef, and so on.
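The atom-and-quantifier behavior can be checked with a short sketch. It uses Python's re module as a stand-in, on the assumption that normal characters and the + quantifier behave the same way there as in the regular expression flavor described here. The candidate strings are illustrative only.

```python
import re

# Each of d, e, and f is its own atom; the + quantifier applies only to d.
pattern = re.compile(r"d+ef")

# Illustrative inputs (assumed for this sketch, not taken from the text).
candidates = ["def", "ddef", "dddef", "ef", "deef"]

# fullmatch requires the entire string to match the pattern.
results = {s: bool(pattern.fullmatch(s)) for s in candidates}

for s, ok in results.items():
    print(s, "matches" if ok else "does not match")
```

The strings ef and deef are rejected because the quantifier requires at least one d and does not extend to the e that follows it.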
Certain characters must be escaped in order to be taken literally, because they have another meaning in a regular expression. For example, the asterisk (*) is treated as a quantifier unless it is escaped. These characters, called metacharacters, are: ., \, ?, *, +, |, ^, $, {, }, (, ), [, and ]. They must be escaped except when they appear within square brackets.
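The difference between an escaped and an unescaped metacharacter can be seen in a small sketch. It again uses Python's re module as a stand-in, assuming the asterisk is handled the same way there. The pattern 2*3 and the sample strings are hypothetical examples chosen for illustration.

```python
import re

samples = ["2*3", "223", "3"]

# Unescaped, * is a quantifier: zero or more 2s followed by a 3.
unescaped = re.compile(r"2*3")
# Escaped, \* stands for a literal asterisk.
escaped = re.compile(r"2\*3")
# Within square brackets, the asterisk is literal without a backslash.
bracketed = re.compile(r"2[*]3")

unescaped_hits = [s for s in samples if unescaped.fullmatch(s)]
escaped_hits = [s for s in samples if escaped.fullmatch(s)]
bracketed_hits = [s for s in samples if bracketed.fullmatch(s)]

print(unescaped_hits)
print(escaped_hits)
print(bracketed_hits)
```

Only the escaped and bracketed forms accept the string 2*3; the unescaped form accepts 223 and 3 instead.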
These characters are escaped by preceding them with a backslash. This is referred to as a single-character escape because there is only one matching character. For convenience, there are three additional single-character escapes for the whitespace characters tab, line feed, and carriage return. Table 18-4 lists the single-character escapes.
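As a rough illustration of the whitespace escapes, the sketch below assumes they are written \t, \n, and \r, which is how Python's re module (used here as a stand-in) spells them. The sample strings are made up for the example.

```python
import re

# Strings that contain an actual tab, line feed, or carriage return.
samples = {
    "tab": "a\tb",
    "line feed": "a\nb",
    "carriage return": "a\rb",
}

# Raw strings keep the backslash, so the regex engine sees the escape itself.
patterns = {
    "tab": r"a\tb",
    "line feed": r"a\nb",
    "carriage return": r"a\rb",
}

matched = {
    name: bool(re.fullmatch(patterns[name], samples[name]))
    for name in samples
}

# Each escape matches exactly one character, so a tab escape
# does not match a line feed.
cross_match = bool(re.fullmatch(patterns["tab"], samples["line feed"]))

print(matched)
print(cross_match)
```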