Perl Cookbook by Nathan Torkington, Tom Christiansen

Preprocessing Input

Problem

You’d like your programs to work on files with funny formats, such as compressed files or remote web documents specified with a URL, but your program only knows how to access regular text in local files.

Solution

Take advantage of Perl’s easy pipe handling by changing your input files’ names to pipes before opening them.

To autoprocess gzipped or compressed files by decompressing them with gzip, use:

@ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_  } @ARGV;
while (<>) {
    # .......
}

To fetch URLs before processing them, use the GET program from LWP (see Chapter 20):

@ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV;
while (<>) {
    # .......
}

You might prefer to fetch just the text, of course, not the HTML. That just means using a different command, perhaps lynx -dump.
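The two rewrites above can be folded into one helper. The following is only a sketch: the name preprocess_name is hypothetical, and it assumes gzip and lynx are on your PATH. URLs are checked first so that a URL ending in .gz is not handed to gzip as if it were a local file.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the recipe): turn one argument into a
# name that magic open understands. Plain filenames pass through unchanged.
sub preprocess_name {
    my ($name) = @_;
    return "lynx -dump $name |" if $name =~ m#^\w+://#;     # URL: fetch text only
    return "gzip -dc $name |"   if $name =~ /\.(?:gz|Z)$/;  # compressed file
    return $name;                                           # ordinary file
}

# Show what each kind of argument becomes.
print "$_\n"
    for map { preprocess_name($_) } qw(notes.txt 09tails.gz http://www.example.com/);
```

In a real program you would assign the mapped list back to @ARGV before the while (<>) loop. Because the resulting string is run by the shell, names containing spaces or shell metacharacters need quoting first.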

Discussion

As shown in Section 16.1, Perl’s built-in open function is magical: you don’t have to do anything special to get Perl to open a pipe instead of a file. (That’s why it’s sometimes called magic open and, when applied to implicit ARGV processing, magic ARGV.) If it looks like a pipe, Perl will open it like a pipe. We take advantage of this by rewriting certain filenames to include a decompression or other preprocessing stage. For example, with the Solution’s code the file "09tails.gz" becomes "gzip -dc 09tails.gz |".
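To see magic open in isolation, here is a minimal sketch using the two-argument form of open directly. The helper name slurp_magic is made up for illustration, and the example assumes a shell that provides echo.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from a "magic" name.
# With two-argument open, a trailing "|" makes Perl run the string as a
# command and read its output; any other name is opened as an ordinary file.
sub slurp_magic {
    my ($name) = @_;
    open(my $fh, $name) or die "can't open $name: $!";
    my @lines = <$fh>;
    close($fh)          or die "can't close $name: $! (exit status $?)";
    return @lines;
}

# The same call works for a filename or for a command ending in "|".
print slurp_magic("echo hello from a pipe |");
```

The three-argument form of open does not inspect the name this way, which is why the rewriting trick depends on the two-argument open that <> uses on @ARGV.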

This technique has further applications. Suppose you wanted to read /etc/passwd if the machine isn’t using NIS, and the output of ypcat passwd if it is. You’d use the output of the domainname program ...
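One way the choice between the file and the pipe might look is sketched below. The helper passwd_source is hypothetical, and the test for NIS is an assumption: it treats empty output or "(none)" from domainname as meaning NIS is not in use.

```perl
use strict;
use warnings;

# Hypothetical helper: pick a password source from domainname's output.
# Assumption: an empty string or "(none)" means NIS is not in use.
sub passwd_source {
    my ($domain) = @_;
    $domain = '' unless defined $domain;
    $domain =~ s/\s+\z//;                  # drop the trailing newline
    return $domain =~ /^(?:\(none\))?$/
        ? '< /etc/passwd'                  # ordinary file
        : 'ypcat passwd |';                # pipe from NIS
}

# Either result can be handed to two-argument open unchanged.
my $source = passwd_source(scalar `domainname 2>/dev/null`);
print "would open: $source\n";
```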
