7.2. Simple Uses of Regular Expressions

If we were looking for all lines of a file that contain the string abc, we might use the grep command:

grep abc somefile >results

In this case, abc is the regular expression that the grep command tests against each input line. Lines that match are sent to standard output, here ending up in the file results because of the command-line redirection.

In Perl, we can speak of the string abc as a regular expression by enclosing the string in slashes:

if (/abc/) {
    print $_;
}

But what is being tested against the regular expression abc in this case? Why, it's our old friend, the $_ variable! When a regular expression is enclosed in slashes (as above), the $_ variable is tested against the regular expression. If the regular expression matches, the match operator returns true. Otherwise, it returns false.
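To see the true and false results for yourself, here is a minimal sketch. The helper name contains_abc and the two sample strings are made up for illustration; the helper simply copies its argument into $_ so that the bare /abc/ has something to test.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the text): test one string against /abc/.
sub contains_abc {
    local $_ = shift;        # copy the argument into $_ for the bare match
    return /abc/ ? 1 : 0;    # /abc/ with no explicit target tests $_
}

# Two made-up sample lines: one contains abc in sequence, one does not.
for my $line ("xxabcxx\n", "a b c\n") {
    print contains_abc($line) ? "match:    $line" : "no match: $line";
}
```

The second sample holds the same three letters, but not in sequence, so the match fails.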

For this example, the $_ variable is presumed to contain some text line and is printed if the line contains the characters abc in sequence anywhere within the line—similar to the grep command above. Unlike the grep command, which is operating on all of the lines of a file, this Perl fragment is looking at just one line. To work on all lines, add a loop, as in:

while (<>) {
    if (/abc/) {
        print $_;
    }
}
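If you'd like to try the loop without creating a file first, the following sketch may help. It assumes a reasonably modern Perl (5.8 or later) that can open a filehandle on a string; the helper name grep_abc and the sample text are invented for this example, and the in-memory filehandle merely stands in for somefile.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the text): return the lines read from
# a filehandle that contain abc, much as grep abc would print them.
sub grep_abc {
    my ($fh) = @_;
    my @matches;
    local $_;                        # keep the caller's $_ untouched
    while (<$fh>) {                  # each input line lands in $_
        push @matches, $_ if /abc/;  # keep the line when the match succeeds
    }
    return @matches;
}

# Made-up sample input; an in-memory filehandle stands in for somefile.
my $text = "abc first\nno match here\nxx abc yy\n";
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
print grep_abc($fh);
close($fh);
```

Collecting the matching lines in a list rather than printing them inside the loop is only a convenience here, so that the result can be inspected or redirected later.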

What if we didn't know the number of b's between the a and the c? That is, what if we wanted to print the line if it contains an a followed by zero or more b's, followed by a c? With grep, we'd say:

grep "ab*c" somefile >results

(The argument containing ...
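The pattern ab*c can also be tried between slashes in Perl. The following is only a sketch under that assumption: the helper name matches_ab_star_c and the sample strings are invented for illustration, and the helper again copies its argument into $_ so the bare match has something to test.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the text): true if the string contains
# an a, then zero or more b's, then a c.
sub matches_ab_star_c {
    local $_ = shift;         # copy the argument into $_ for the bare match
    return /ab*c/ ? 1 : 0;    # b* allows any number of b's, including none
}

# Made-up samples: zero b's, one b, several b's, and a non-matching string.
for my $sample ("ac", "abc", "abbbbc", "a-c") {
    my $verdict = matches_ab_star_c($sample) ? "matches" : "does not match";
    printf "%-8s %s\n", $sample, $verdict;
}
```

Note that ac qualifies, because "zero or more" includes the case of no b's at all.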
