Metacharacters and Metasymbols
Now that we’ve admired all the fancy cages, we can go back to looking at the critters in the cages—those funny-looking symbols you put inside the patterns. By now you’ll have cottoned to the fact that these symbols aren’t regular Perl code like function calls or arithmetic operators. Regular expressions are their own little language nestled inside of Perl. (There’s a bit of the jungle in all of us.)
For all their power and expressivity, patterns in Perl recognize the same 12 traditional metacharacters (the Dirty Dozen, as it were) found in many other regular expression packages:
\ | ( ) [ { ^ $ * + ? .
Some of those bend the rules, making otherwise normal characters that follow them special. We don’t like to call the longer sequences “characters”, so when they make longer sequences, we call them metasymbols (or sometimes just “symbols”). But at the top level, those 12 metacharacters are all you (and Perl) need to think about. Everything else proceeds from there.
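As a rough illustration of the paragraph above, the following sketch checks that each of the 12 metacharacters matches itself literally once a backslash is put in front of it, and that a backslash can go the other way and turn an ordinary character into a metasymbol. The variable names (@meta, $literal_count, and so on) are ours, chosen only for this example, and \d is just one metasymbol picked for the demonstration.

```perl
use strict;
use warnings;

# The 12 traditional metacharacters from the list above.
my @meta = ('\\', '|', '(', ')', '[', '{', '^', '$', '*', '+', '?', '.');

# Count how many match themselves literally once a backslash precedes them.
my $literal_count = 0;
for my $char (@meta) {
    my $escaped = '\\' . $char;    # backslash + metacharacter, e.g. \. or \\
    $literal_count++ if $char =~ /\A$escaped\z/;
}
print "$literal_count of ", scalar(@meta), " match literally when escaped\n";

# Unescaped, the dot is a metacharacter: it matches the x in "axc".
my $dot_any     = ('axc' =~ /a.c/)  ? 1 : 0;
# Escaped, it matches only a real dot, so "axc" no longer matches.
my $dot_literal = ('axc' =~ /a\.c/) ? 1 : 0;

# A backslash can also make a normal character special:
# plain d matches the letter d, while \d is a metasymbol matching a digit.
my $plain_d  = ('7' =~ /d/)  ? 1 : 0;
my $symbol_d = ('7' =~ /\d/) ? 1 : 0;

print "a.c  vs axc: $dot_any\n";
print "a\\.c vs axc: $dot_literal\n";
print "d    vs 7:   $plain_d\n";
print "\\d   vs 7:   $symbol_d\n";
```

The backslash does double duty here: in front of a metacharacter it removes the special meaning, and in front of an ordinary character such as d it adds one.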
Some simple metacharacters stand by themselves, like . and ^ and $. They don’t directly affect anything around them. Some metacharacters work like prefix operators, governing what follows them, like \. Others work like postfix operators, governing what immediately precedes them, like *, +, and ?. One metacharacter, |, acts like an infix operator, standing between the operands it governs. There are even bracketing metacharacters that work like circumfix operators, governing something contained inside ...
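To make the operator analogy a little more concrete, here is a minimal sketch that pairs each role with one small pattern and reports whether it matches. The test strings, the @checks table, and the choice of ( ) and [ ] as the bracketing examples are our own assumptions for illustration; the passage breaks off before it names the circumfix metacharacters itself.

```perl
use strict;
use warnings;

# Each entry: [ role being illustrated, test string, pattern, expected result ]
my @checks = (
    [ 'standalone .     any one character',          'cat',   qr/c.t/,        1 ],
    [ 'standalone ^ $   start and end of string',    'cat',   qr/^cat$/,      1 ],
    [ 'prefix \\        makes the next . literal',   'cat',   qr/c\.t/,       0 ],
    [ 'postfix *        zero or more of the a',      'ct',    qr/^ca*t$/,     1 ],
    [ 'postfix +        one or more of the a',       'ct',    qr/^ca+t$/,     0 ],
    [ 'postfix ?        zero or one of the a',       'ct',    qr/^ca?t$/,     1 ],
    [ 'infix |          either alternative',         'dog',   qr/cat|dog/,    1 ],
    [ 'circumfix ( )    groups ca for the +',        'cacat', qr/^(ca)+t$/,   1 ],
    [ 'circumfix [ ]    one character from a set',   'cot',   qr/^c[aeiou]t$/, 1 ],
);

my $failures = 0;
for my $check (@checks) {
    my ($role, $string, $pattern, $expected) = @$check;
    my $got = ($string =~ $pattern) ? 1 : 0;
    $failures++ if $got != $expected;
    printf "%-45s %-6s %s\n", $role, $string, $got ? 'matches' : 'no match';
}
print "unexpected results: $failures\n";
```

Note how the postfix quantifiers apply only to the single a before them, while wrapping ca in parentheses lets + govern the whole group.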