Matching Shell Globs as Regular Expressions

Problem

You want to allow users to specify matches using traditional shell wildcards, not full Perl regular expressions. Wildcards are easier to type than full regular expressions for simple cases.

Solution

Use the following subroutine to convert four shell wildcard characters into their equivalent regular expression; all other characters will be quoted to render them literals.

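sub glob2pat {
    my $globstr = shift;
    # Wildcards with a regex equivalent; brackets pass through unchanged.
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    # Translate mapped characters; quote everything else so it matches literally.
    $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
    # Anchor both ends, because a glob must match the whole name.
    return '^' . $globstr . '$';
}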

Discussion

A Perl pattern is not the same as a shell wildcard pattern. The shell’s *.* is not a valid regular expression. Its meaning as a pattern would be /^.*\..*$/, which is admittedly much less fun to type.

The function given in the Solution makes these conversions for you, following the standard wildcard rules used by the glob built-in.

Shell         Perl
list.?        ^list\..$
project.*     ^project\..*$
*old          ^.*old$
type*.[ch]    ^type.*\.[ch]$
*.*           ^.*\..*$
*             ^.*$
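To see a few of these conversions at work, here is a minimal sketch that repeats the Solution's subroutine so it runs on its own, then filters a list of names with the generated patterns. The filenames in @files are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Same conversion as in the Solution, repeated so this runs standalone.
sub glob2pat {
    my $globstr = shift;
    my %patmap = (
        '*' => '.*',
        '?' => '.',
        '[' => '[',
        ']' => ']',
    );
    $globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
    return '^' . $globstr . '$';
}

# Hypothetical filenames, for illustration only.
my @files = qw(list.c list.h list.txt types.c typedef.h notes.old);

for my $glob ('list.?', 'type*.[ch]', '*old') {
    my $pat     = glob2pat($glob);          # e.g. 'list.?' becomes '^list\..$'
    my @matches = grep { /$pat/ } @files;   # keep names matching the whole pattern
    print "$glob => $pat => @matches\n";
}
```

With these sample names, list.? selects list.c and list.h but not list.txt, type*.[ch] selects types.c and typedef.h, and *old selects only notes.old.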

In the shell, the rules are different. The entire pattern is implicitly anchored at both ends. A question mark matches any single character, an asterisk matches any run of characters (including none), and brackets enclose a set of characters to match. Everything else matches itself literally.
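The implicit anchoring is the easiest of these rules to overlook, because a Perl pattern matches anywhere in the string unless told otherwise. The following sketch compares the anchored translation of *old with a bare /old/; the names in @names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical names, for illustration only.
my @names = qw(old.txt golden threshold file.old);

# Glob-style: the whole name must match, so only names ending in "old" pass.
my @anchored = grep { /^.*old$/ } @names;

# Regex default: "old" may appear anywhere in the name.
my @unanchored = grep { /old/ } @names;

print "anchored:   @anchored\n";
print "unanchored: @unanchored\n";
```

Here the anchored pattern keeps only threshold and file.old, while the unanchored one accepts all four names.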

Most shells do more than simple one-directory globbing. For instance, you can say */* to mean “all the files in all the subdirectories of the current directory.” Furthermore, ...
