Representing Groups of Characters
Sometimes characters fall into convenient groups, such as decimal digits or punctuation characters. Three different kinds of escapes can be used to represent a group of characters: multi-character escapes, category escapes, and block escapes. Like single-character escapes, they all start with a backslash.
Multi-Character Escapes
Multi-character escapes, listed in Table 18-7, represent groups of related characters. They are called multi-character escapes because each one can match any of several characters. However, each escape matches exactly one character in the string being tested. To match a run of such characters, follow the escape with a quantifier such as +.
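The difference between one escape and an escape with a quantifier can be sketched as follows. This is only an illustration, assuming Python's `re` module as a stand-in for an XML Schema or XQuery regular expression processor; the helper `matches` is a hypothetical name, and `fullmatch` is used because these patterns are meant to cover the whole string.

```python
import re

# Assumption: Python's re module stands in for an XML regex engine here.
# For the inputs below, its \d behaves like the digit escape described above.
one_digit = re.compile(r"\d")     # exactly one digit
many_digits = re.compile(r"\d+")  # one or more digits, thanks to the + quantifier


def matches(pattern, text):
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return pattern.fullmatch(text) is not None


print(matches(one_digit, "7"))         # True: a single digit
print(matches(one_digit, "42"))        # False: the escape covers only one character
print(matches(many_digits, "42"))      # True: the quantifier allows a run of digits
print(matches(many_digits, "\u0663"))  # True: U+0663, a digit in another style
```

The escape on its own rejects "42" because it accounts for only one character; adding the quantifier is what lets the pattern accept longer values.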
Table 18-7. Multi-character escapes
| Escape | Meaning |
|---|---|
| `\s` | A whitespace character, as defined by XML (space, tab, carriage return, or line feed) |
| `\S` | A character that is not a whitespace character |
| `\d` | A decimal digit (0 to 9), or a digit in another style, for example, an Arabic-Indic digit |
| `\D` | A character that is not a decimal digit |
| `\w` | A "word" character, that is, any character not in any of the Unicode categories Punctuation, Separators, and Other |
| `\W` | A nonword character, that is, any character in one of the Unicode categories Punctuation, Separators, and Other |
| `\i` | A character that is allowed as the first character of an XML name, i.e., a letter, an underscore (_), or a colon (:); the "i" stands for "initial" |
| `\I` | A character that cannot be the first character ... |
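
The XML definition of whitespace in the table is narrower than the one many general-purpose regex engines use. As a rough sketch, assuming Python's `re` module (which is not an XML Schema engine), the four-character definition can be emulated with an explicit character class. The names `XML_WHITESPACE`, `XML_NON_WHITESPACE`, and `is_xml_whitespace` are hypothetical and exist only for this illustration.

```python
import re

# Assumption: emulate the XML whitespace escape with an explicit class holding
# only the four characters named in the table: space, tab, CR, and LF.
XML_WHITESPACE = re.compile(r"[ \t\r\n]")
XML_NON_WHITESPACE = re.compile(r"[^ \t\r\n]")  # the negated counterpart


def is_xml_whitespace(ch):
    """Return True if ch is a single XML whitespace character (hypothetical helper)."""
    return XML_WHITESPACE.fullmatch(ch) is not None


for ch in [" ", "\t", "\r", "\n", "\f", "a"]:
    print(repr(ch), is_xml_whitespace(ch))

# Python's own \s is broader: it also accepts a form feed, which the XML
# definition in the table does not include.
print(re.fullmatch(r"\s", "\f") is not None)
```

Spelling the class out keeps the pattern faithful to the table when the host language's own whitespace escape accepts additional characters.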