Character Classes
In a pattern match, you may match any character that
has--or that does not have--a particular property. There are four ways
to specify character classes. You may specify a character class in
the traditional way using square brackets and enumerating the possible
characters, or you may use any of three mnemonic shortcuts: the
classic Perl classes, the new Perl Unicode properties, or the standard
POSIX classes. Each of these shortcuts matches only one character from
its set. Quantify them to match larger expanses, such as \d+ to match
one or more digits. (An easy mistake is to think that \w matches a
word. Use \w+ to match a word.)
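The difference between a bare shortcut and its quantified form can be sketched as follows. This is a minimal illustration, and the sample string and variable names are assumptions chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical sample text for illustration.
my $text = "Order 66 shipped";

# \d alone matches exactly one digit; \d+ matches a run of digits.
my ($one_digit)  = $text =~ /(\d)/;
my ($all_digits) = $text =~ /(\d+)/;

# \w alone matches a single word character; \w+ matches a whole word.
my ($one_char)   = $text =~ /(\w)/;
my ($first_word) = $text =~ /(\w+)/;

print "$one_digit $all_digits $one_char $first_word\n";   # prints "6 66 O Order"
```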
Custom Character Classes
An enumerated list of characters in square brackets
is called a character class and matches any one
of the characters in the list. For example, [aeiouy] matches a letter
that can be a vowel in English. (For Welsh add a "w", for Scottish an
"r".) To match a right square bracket, either
backslash it or place it first in the list.
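As a small sketch of these rules, the following counts the possible vowels in a word and shows both ways of putting a right square bracket in a class. The test strings and variable names are assumptions made up for the example.

```perl
use strict;
use warnings;

# Count the characters in a word that can act as English vowels.
my $word = "rhythmically";
my $vowel_count = () = $word =~ /[aeiouy]/g;   # y, i, a, y

# Two ways to include a right square bracket in a class:
my $bracketed    = "x]y";
my $escaped_hit  = $bracketed =~ /[\]]/ ? 1 : 0;   # backslashed
my $first_hit    = $bracketed =~ /[]]/  ? 1 : 0;   # placed first in the list

print "$vowel_count $escaped_hit $first_hit\n";   # prints "4 1 1"
```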
Character ranges may be indicated using a hyphen and the a-z notation.
Multiple ranges may be combined; for example, [0-9a-fA-F] matches one
hex "digit". You may use a backslash to protect a hyphen that would
otherwise be interpreted as a range delimiter, or just put it at the
beginning or end of the class (a practice which is arguably less
readable but more traditional).
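Here is a short sketch of combined ranges and of the three placements that make a hyphen literal. The candidate strings and variable names are assumptions for illustration only.

```perl
use strict;
use warnings;

# Keep only the strings made up entirely of hex digits.
my @candidates = qw(ff 7E g1 DEAD);
my @hex_only   = grep { /^[0-9a-fA-F]+$/ } @candidates;

# A literal hyphen: backslashed, first in the class, or last in the class.
my @sign_classes = (qr/^[\-+]\d+$/, qr/^[-+]\d+$/, qr/^[+-]\d+$/);
my $input        = "-5";
my @sign_hits    = map { $input =~ $_ ? 1 : 0 } @sign_classes;

print "@hex_only\n";    # prints "ff 7E DEAD"
print "@sign_hits\n";   # prints "1 1 1"
```

All three sign classes accept the same strings, so the choice between them is a matter of readability.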
A caret (or circumflex, or hat, or up arrow) at the front of the character class inverts the class, causing ...