Finding Files with File::Find
The first step in building our first link checker is to figure out a way for our script to get a list of all the HTML files on our site. Back in Chapter 4, we fed our script a list of filenames on the command line using the shell’s ability to expand wildcard characters. Now, though, we’re going to take a different approach, by using the standard File::Find module. We use it by putting use File::Find into our script, then invoking the module’s find function. This will make it easy to construct a script that processes all the files under a given starting directory, including those in deeper subdirectories.
We’ll start with the simple demonstration script, find_files.plx, shown in Example 11-1. (As with all the examples in this book, you can download it from the book’s web site, at http://www.elanus.net/book/.)
Example 11-1. find_files.plx
#!/usr/bin/perl -w

# find_files.plx

# this script demonstrates the use of the File::Find module.

use strict;
use File::Find;

my $start_dir = shift or die "Usage: $0 <start_dir>\n";

unless (-d $start_dir) {
    die "Start directory '$start_dir' is not a directory.\n";
}

find(\&process, $start_dir);

sub process {

    # this is invoked by File::Find's find function for each
    # file it recursively finds.

    print "Found $File::Find::name\n";
}
Most of this script should look pretty straightforward at this point. It starts off by shifting off the first item in @ARGV (that is, the first argument supplied to the script when it was invoked ...
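Since the stated goal is a list of just the HTML files, here is a minimal sketch of how the same find call might be narrowed down. It is not one of the book's numbered examples: the script name list_html.plx, the helper collect_html, and the assumption that HTML files end in .html or .htm are all ours, made up for illustration.

```perl
#!/usr/bin/perl -w

# list_html.plx (hypothetical name; not one of the book's examples)

# a sketch of using File::Find to gather only the HTML files
# under a starting directory.

use strict;
use File::Find;

# collect_html is a hypothetical helper. it returns the paths of
# all plain files under $start_dir whose names end in .html or
# .htm (an assumed naming convention; adjust to suit your site).

sub collect_html {
    my $start_dir = shift;
    my @html_files;

    find(sub {
        # by default, find changes into each directory as it goes,
        # so $_ holds the bare filename, while $File::Find::name
        # holds the path beginning with $start_dir.
        return unless -f $_;          # skip directories and the like
        return unless /\.html?$/i;    # keep only .html and .htm names
        push @html_files, $File::Find::name;
    }, $start_dir);

    my @sorted = sort @html_files;
    return @sorted;
}

# only do any work if a start directory was given on the command line.
if (@ARGV) {
    print "$_\n" for collect_html($ARGV[0]);
}
```

Invoked as, say, list_html.plx /some/start/dir, this would print one path per line. Returning the list from a subroutine, rather than printing inside the callback as Example 11-1 does, is just one way to hand the filenames on to later processing.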