Perl Cookbook by Nathan Torkington, Tom Christiansen

Commenting Regular Expressions

Problem

You want to make your complex regular expressions understandable and maintainable.

Solution

You have four techniques at your disposal: comments outside the pattern, comments inside the pattern with the /x modifier, comments inside the replacement part of s///, and alternate delimiters.

Discussion

The sample code in Example 6-1 uses all four techniques. The initial comment describes the overall intent of the regular expression. For relatively simple patterns, this may be all that is needed. More complex patterns, like the one in the example, need more documentation.
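Before the full example, here is a minimal sketch of the first two techniques on a much simpler pattern: one comment outside the pattern stating its intent, and per-piece comments inside it under /x. The date format, the sub name parse_date, and the sample string are assumptions made for illustration; they are not part of the recipe.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Outside comment: pull year, month, and day out of a YYYY-MM-DD date.
# (parse_date is a hypothetical helper, used only in this sketch.)
sub parse_date {
    my ($text) = @_;
    my @parts = $text =~ m{
        \b
        (\d{4})     # four-digit year, first capture
        -           # literal separator
        (\d{2})     # two-digit month, second capture
        -
        (\d{2})     # two-digit day, third capture
        \b
    }x;
    return @parts;  # empty list if nothing matched
}

my ($year, $month, $day) = parse_date("released on 1998-08-01, revised later");
print "year=$year month=$month day=$day\n";
```

Under /x, whitespace in the pattern is ignored and a # starts a comment that runs to the end of the line, so the pattern can be laid out one piece per line.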

Example 6-1. resname

#!/usr/bin/perl -p
# resname - change all "foo.bar.com" style names in the input stream
# into "foo.bar.com [204.148.40.9]" (or whatever) instead

use Socket;                 # load inet_ntoa
s{                          #
    (                       # capture the hostname in $1
        (?:                 # these parens for grouping only
            (?! [-_]  )     # lookahead for neither underscore nor dash
            [\w-] +         # hostname component
            \.              # and the domain dot
        ) +                 # now repeat that whole thing a bunch of times
        [A-Za-z]            # next must be a letter
        [\w-] +             # now trailing domain part
    )                       # end of $1 capture
}{                          # replace with this:
    "$1 " .                 # the original bit, plus a space
           ( ($addr = gethostbyname($1))   # if we get an addr
            ? "[" . inet_ntoa($addr) . "]" #        format it
            : "[???]"                      # else mark dubious
           )
}gex;               # /g for global
                    # /e for execute
                    # /x for nice formatting
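To try the same commented substitution without touching the network, you could swap the DNS lookup for a lookup table. The sketch below is a variation on Example 6-1, not the recipe's own code: the %fake_dns hash, its single entry, and the sub name annotate_hosts are hypothetical stand-ins for gethostbyname and inet_ntoa.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for gethostbyname/inet_ntoa so the sketch runs offline.
my %fake_dns = ('www.perl.com' => '192.0.2.10');

sub annotate_hosts {
    my ($text) = @_;
    $text =~ s{
        (                       # capture the hostname
            (?:                 # these parens for grouping only
                (?! [-_] )      # component may not start with dash or underscore
                [\w-] +         # hostname component
                \.              # and the domain dot
            ) +                 # one or more such components
            [A-Za-z]            # next must be a letter
            [\w-] +             # now trailing domain part
        )                       # end of capture
    }{
        "$1 " .                         # the original text, plus a space
        ( exists $fake_dns{$1}          # is the name in the table?
            ? "[$fake_dns{$1}]"         #   yes: show its address
            : "[???]"                   #   no: mark dubious
        )
    }gex;
    return $text;
}

print annotate_hosts("mail from www.perl.com and no.such.example\n");
```

Because of /e, the replacement part is ordinary Perl code, so its comments are ordinary Perl comments; the comments in the pattern part are regex comments enabled by /x.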

For aesthetics, the example uses alternate delimiters. When you split your match or substitution over multiple lines, ...
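The readability gain from alternate delimiters is easiest to see on a pattern full of slashes. The following sketch, with hypothetical paths and sub names chosen only for illustration, performs the same substitution twice: once with slash delimiters and once with braces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical example paths; both subs perform the same substitution.
sub relocate_with_slashes {
    my ($path) = @_;
    $path =~ s/\/usr\/local\/bin\//\/opt\/bin\//;   # every slash must be escaped
    return $path;
}

sub relocate_with_braces {
    my ($path) = @_;
    $path =~ s{/usr/local/bin/}{/opt/bin/};         # no escapes needed
    return $path;
}

print relocate_with_slashes('/usr/local/bin/perl'), "\n";
print relocate_with_braces('/usr/local/bin/perl'), "\n";
```

Both calls print /opt/bin/perl; only the second form can be read at a glance.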
