Matching Shell Globs as Regular Expressions

Problem

You want to allow users to specify matches using traditional shell wildcards, not full Perl regular expressions. Wildcards are easier to type than full regular expressions for simple cases.

Solution

Use the following subroutine to convert four shell wildcard characters into their regular expression equivalents; all other characters are quoted so that they match literally.

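sub glob2pat {
    my $globstr = shift;
    # Shell wildcard characters and their regular expression equivalents.
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    # Translate wildcards; quote every other character so it matches literally.
    $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
    # Anchor both ends, because a glob must match the whole name.
    return '^' . $globstr . '$';
}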
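
As a minimal sketch of how the returned string might be used, the following repeats the subroutine so that it runs on its own, then filters a list of names with the generated pattern. The file names are made up for illustration.

```perl
use strict;
use warnings;

# Copy of the Solution's subroutine, repeated so this example is self-contained.
sub glob2pat {
    my $globstr = shift;
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
    return '^' . $globstr . '$';
}

# Hypothetical file names, chosen only to exercise the pattern.
my @names = qw(list.c list.h listing.txt type1.c types.h README);

my $pat  = glob2pat('type*.[ch]');      # '^type.*\.[ch]$'
my @hits = grep { /$pat/ } @names;      # interpolate the string as a regex

print "$pat\n";
print "@hits\n";                        # type1.c types.h
```

If the same glob is applied many times, compiling it once with qr/$pat/ avoids recompiling the pattern on every match.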

Discussion

A Perl pattern is not the same as a shell wildcard pattern. The shell’s *.* is not a valid regular expression. Its meaning as a pattern would be /^.*\..*$/, which is admittedly much less fun to type.
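
The difference can be checked directly. This short sketch tries to compile the raw glob as a regular expression inside an eval, then matches the hand-translated pattern against two made-up names.

```perl
use strict;
use warnings;

my $glob = '*.*';

# Compiling the raw glob as a regex dies: the leading * has nothing to repeat.
my $re = eval { qr/$glob/ };
print "'$glob' is not a valid regex\n" unless defined $re;

# The hand-translated equivalent compiles and matches names containing a dot.
my $pat = qr/^.*\..*$/;
for my $name (qw(notes.txt Makefile)) {
    printf "%-10s %s\n", $name, ($name =~ $pat ? 'matches' : 'does not match');
}
```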

The function given in the Solution makes these conversions for you, following the standard wildcard rules used by the glob built-in.

Shell           Perl
list.?          ^list\..$
project.*       ^project\..*$
*old            ^.*old$
type*.[ch]      ^type.*\.[ch]$
*.*             ^.*\..*$
*               ^.*$

In the shell, the rules are different. The entire pattern is implicitly anchored at both ends. A question mark matches any single character, an asterisk matches any sequence of characters (including none), and brackets enclose a character class, such as a set or a range. Everything else matches itself.
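
One caveat about bracketed ranges: because \Q quotes every non-word character, the Solution's subroutine turns the hyphen in a glob such as [a-c] into \-, giving a class that matches only a, a hyphen, or c. The following is a rough sketch of a variant that copies bracket expressions through unchanged. The name glob2pat_classes is hypothetical, and the handling of a leading ! as negation is an assumption based on POSIX shell behavior; a ] as the first class member and backslash escapes are not handled.

```perl
use strict;
use warnings;

# Hypothetical variant of glob2pat (the name is illustrative, not from the recipe).
# Assumption: a bracket expression is copied through unchanged, apart from
# turning the shell's leading ! into the regex ^ for negation.
sub glob2pat_classes {
    my $globstr = shift;
    my %patmap = ('*' => '.*', '?' => '.');
    $globstr =~ s{ \[ (!?) ([^\]]+) \] | (.) }{
        defined($3)
            ? ($patmap{$3} || quotemeta($3))       # wildcard or literal character
            : '[' . ($1 ? '^' : '') . $2 . ']'     # whole bracket expression
    }gexs;
    return '^' . $globstr . '$';
}

my $pat  = glob2pat_classes('[a-c]*.t');           # '^[a-c].*\.t$'
my @hits = grep { /$pat/ } qw(a1.t b2.t d3.t);     # made-up names

print "$pat\n";
print "@hits\n";                                   # a1.t b2.t
```

An unmatched [ falls through to the single-character branch and is quoted as a literal, so a malformed glob still yields a valid pattern.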

Most shells do more than simple one-directory globbing. For instance, you can say */* to mean “all the files in all the subdirectories of the current directory.” Furthermore, ...
