Chapter 13. Regexes and Grammars

Regular expressions, or regexes, were introduced in “Regular Expressions (Regexes)” and “Substitutions”. If you don’t remember much about regexes, you may want to review those sections before reading this chapter. You don’t need to recall every detail covered earlier: we will briefly explain again the specific features we use, but you are expected to understand in general how regexes work.

A Brief Refresher

Regexes, as we have studied them so far, are about string exploration using patterns. A pattern is a sequence of (often special) characters that is supposed to describe a string or part of a string. A pattern matches a string if a correspondence can be found between the pattern and the string.

For example, the following code snippet searches the string for the letter “a”, followed by one or more letters “b” or “c”, followed by zero or more digits, followed by a “B” or a “C”:

my $str = "foo13abbccbcbcbb42Cbar";
say ~$/ if $str ~~ /a <[bc]>+ (\d*) [B|C]/;  # -> abbccbcbcbb42C
say ~$0;                                     # -> 42

This code uses the ~~ smart match operator to check whether the $str string matches the /a <[bc]>+ (\d*) [B|C]/ pattern. Remember that spaces are usually not significant in a regex pattern (unless specified otherwise).
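The remark about spaces can be checked with a minimal sketch. The variable names and the "a 1" test string below are illustrative and not part of the original example. The first two patterns differ only in their spacing, so they should behave identically. The remaining lines show three common ways to match an actual space: a quoted literal, the \s character class, and the :s (sigspace) adverb.

```raku
my $str = "foo13abbccbcbcbb42Cbar";

# Same pattern written with and without spaces: both match.
my $compact = so $str ~~ /a<[bc]>+(\d*)[B|C]/;
my $spaced  = so $str ~~ / a  <[bc]>+  (\d*)  [B|C] /;
say $compact;    # -> True
say $spaced;     # -> True

# A space in the pattern does not match a space in the string.
my $ignored   = so "a 1" ~~ /a \d/;        # pattern means "a" directly followed by a digit
my $quoted    = so "a 1" ~~ /a ' ' \d/;    # a quoted space is a literal space
my $class     = so "a 1" ~~ /a \s \d/;     # \s matches one whitespace character
my $sigspace  = so "a 1" ~~ m:s/a \d/;     # :s makes pattern whitespace significant
say $ignored;    # -> False
say $quoted;     # -> True
say $class;      # -> True
say $sigspace;   # -> True
```

The so prefix only turns each match result into a Boolean so that the outcomes are easy to compare.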

The pattern is made of the following components:

a

A literal match of letter “a”.

<[bc]>+

The <[bc]> is a character class meaning letter “b” or “c”; the + quantifier says characters ...
