Chapter 8. More About Regular Expressions
In the previous chapter, we saw the beginnings of what regular expressions can do. Here we’ll see some of their other common features.
Character Classes
A character class, a list of possible characters inside square brackets ([]), matches any single character from within the class. It matches just one single character, but that one character may be any of the ones listed.
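As a minimal sketch of the "just one single character" rule, the snippet below tests a few strings against a class. The helper name is_single_class_char is hypothetical, and the \A and \z anchors are an addition for this illustration only; they force the class to account for the whole string.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the whole string is exactly one character
# drawn from the class [abcwxyz]. The \A and \z anchors (an assumption of
# this sketch) pin the match to the start and end of the string.
sub is_single_class_char {
    my ($text) = @_;
    return $text =~ /\A[abcwxyz]\z/ ? 1 : 0;
}

for my $text ("a", "w", "m", "ab") {
    my $verdict = is_single_class_char($text) ? "matches" : "does not match";
    print "'$text' $verdict\n";
}
```

Here "a" and "w" match, "m" does not because it is not listed, and "ab" does not because the class can consume only one of its two characters.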
For example, the character class [abcwxyz] may match any one of those seven characters. For convenience, you may specify a range of characters with a hyphen (-), so that class may also be written as [a-cw-z]. That didn’t save much typing, but it’s more usual to make a character class like [a-zA-Z], to match any one letter out of that set of 52.[1] You may use the same character shortcuts as in any double-quotish string to define a character, so the class [\000-\177] matches any seven-bit ASCII character.[2]
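One way to check the claims about ranges is to count how many of the 128 seven-bit characters each class accepts. This is only a sketch: the helper count_matches is hypothetical, and it uses qr// to pass each pattern around, which is not something this section has introduced.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many characters with codes 0..127 are
# matched by the given compiled pattern (expected to hold one class).
sub count_matches {
    my ($pattern) = @_;
    my $count = grep { chr($_) =~ $pattern } 0 .. 127;
    return $count;
}

my $listed  = count_matches(qr/[abcwxyz]/);    # characters listed one by one
my $ranged  = count_matches(qr/[a-cw-z]/);     # the same set written as ranges
my $letters = count_matches(qr/[a-zA-Z]/);     # lowercase and uppercase letters
my $ascii   = count_matches(qr/[\000-\177]/);  # octal escapes as range endpoints

print "listed=$listed ranged=$ranged letters=$letters ascii=$ascii\n";
# prints "listed=7 ranged=7 letters=52 ascii=128"
```

The first two counts agree because [a-cw-z] is just shorthand for [abcwxyz], and the last class covers every code from octal 000 through octal 177.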
Of course, a character class will be just part of a full pattern; it will never stand on its own in Perl. For example, you might see code that says something like this:
    $_ = "The HAL-9000 requires authorization to continue.";
    if (/HAL-[0-9]+/) {
        print "The string mentions some model of HAL computer.\n";
    }
Sometimes, it’s easier to specify the characters left out, rather than the ones within the character class. A caret (^) at the start of the character class negates it. That is, [^def] will match any single character except one of those three. And [^n\-z] matches ...
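The negated class [^def] can be tried out with a short sketch like the one below. The helper name has_char_outside_def and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string contains at least one character
# that is not d, e, or f.
sub has_char_outside_def {
    my ($text) = @_;
    return $text =~ /[^def]/ ? 1 : 0;
}

for my $text ("feed", "fence", "") {
    my $verdict = has_char_outside_def($text) ? "match" : "no match";
    print "'$text': $verdict\n";
}
```

"feed" fails because every character is d, e, or f, while "fence" succeeds on its n. The empty string also fails: a negated class still has to match one character, so it cannot succeed when there is nothing to match.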