Character Classes

A character class, a list of possible characters inside square brackets ([]), matches any single character from within the class. It matches exactly one character, but that character may be any of the ones listed.

For example, the character class [abcwxyz] may match any one of those seven characters. For convenience, you may specify a range of characters with a hyphen (-), so that class may also be written as [a-cw-z]. That didn’t save much typing, but it’s more usual to make a character class like [a-zA-Z] to match any one letter out of that set of 52. You may use the same character shortcuts as in any double-quoted string to define a character, so the class [\000-\177] matches any seven-bit ASCII character. Of course, a character class will be just part of a full pattern; it will never stand on its own in Perl. For example, you might see code that says something like this:

$_ = "The HAL-9000 requires authorization to continue.";
if (/HAL-[0-9]+/) {
  print "The string mentions some model of HAL computer.\n";
}
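
The classes mentioned above can also be tried one character at a time. The following is only a rough sketch: the subroutine name classes_matching and the sample characters are made up for illustration, and the ^ and $ anchors are an addition (not part of the classes themselves) so that each class has to account for the entire one-character string.

```perl
use strict;
use warnings;

# Hypothetical helper: report which classes from the text match a
# one-character string. The ^ and $ anchors are an assumption of this
# sketch; they force the class to cover the whole string.
sub classes_matching {
    my ($char) = @_;
    my @hits;
    push @hits, 'listed' if $char =~ /^[abcwxyz]$/;     # seven characters spelled out
    push @hits, 'ranged' if $char =~ /^[a-cw-z]$/;      # same seven, written as ranges
    push @hits, 'letter' if $char =~ /^[a-zA-Z]$/;      # any of the 52 letters
    push @hits, 'ascii'  if $char =~ /^[\000-\177]$/;   # any seven-bit ASCII character
    return join ',', @hits;
}

# Illustrative sample characters only.
for my $char ('b', 'w', 'm', 'Q', '7') {
    print "$char: ", classes_matching($char), "\n";
}
```

Since [abcwxyz] and [a-cw-z] describe the same seven characters, the listed and ranged labels should always appear together or not at all.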

Sometimes, it’s easier to specify the characters left out, rather than the ones within the character class. A caret (^) at the start of the character class negates it. That is, [^def] will match any single character except one of those three. And [^n\-z] matches any character except for n, hyphen, or z. (Note that the hyphen is backslashed because it’s special inside a character class. But the first hyphen in /HAL-[0-9]+/ doesn’t need a backslash ...
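
To see what the backslash changes inside a negated class, here is a small sketch. The subroutine name describe and the sample characters are invented for this example, and the ^ and $ anchors are again an added assumption so that each class covers the whole one-character string. The third pattern, [^n-z], is not from the text; it is included only to contrast an unescaped hyphen, which forms the range n through z.

```perl
use strict;
use warnings;

# Hypothetical helper: return three flags (1 = matches, 0 = does not)
# for a one-character string, in this order:
#   [^def]   anything except d, e, or f
#   [^n\-z]  anything except n, a literal hyphen, or z
#   [^n-z]   anything outside the range n through z (contrast case)
sub describe {
    my ($char) = @_;
    my $not_def   = $char =~ /^[^def]$/  ? 1 : 0;
    my $escaped   = $char =~ /^[^n\-z]$/ ? 1 : 0;
    my $unescaped = $char =~ /^[^n-z]$/  ? 1 : 0;
    return "$not_def$escaped$unescaped";
}

# Illustrative sample characters only.
for my $char ('d', 'g', 'n', '-', 'p') {
    print "'$char' => ", describe($char), "\n";
}
```

The letter p shows the difference: it is allowed by [^n\-z], which excludes only three characters, but rejected by [^n-z], which excludes every letter from n through z.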

