Perl for Web Site Management

by John Callender
October 2001
Beginner
528 pages
15h 20m
English
O'Reilly Media, Inc.

Putting It All Together

Let’s take stock of what we’ve done so far. We’ve written a script that will descend recursively through a filesystem, reading in the contents of any HTML files it encounters and extracting all the <A HREF="..."> and <IMG SRC="..."> attributes from those files. We’ve also created a subroutine that will take a directory name and a list of links extracted from a file in that directory, identify which links point to local files, and convert them to full (that is, absolute) filesystem pathnames.

The fast-but-stupid version of our link-checker is almost finished. The main thing left is defining the data structure that will hold the information on the bad links it discovers.

For that, we go back to the top of the script, just below the configuration section, and add the following:

my %bad_links;    # A "hash of arrays" with keys consisting of URLs
                  # under $start_base, and values consisting of lists 
                  # of bad links on those pages.

my %good;         # A hash mapping filesystem paths to
                  # 1 or 0 (for good or bad). Used to cache the results
                  # of previous checks so they needn't be repeated for
                  # subsequent pages.

Here we’ve declared two new hashes that are going to be used in our script: %bad_links and %good. %good is fairly straightforward; we’re going to use it to store the result of testing the links our script processes. The keys of the %good hash are the local filesystem paths for the files we are checking (e.g., /w1/s/socalsail/index.html). A link that turns out to be bad ...
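
To make the roles of the two hashes concrete, here is a minimal sketch of how they might be filled in. It is not the book's own code: the subroutine name record_bad_links, the example URL, and the missing-file path are all hypothetical, and the sketch assumes that a value of 1 in %good means the file exists and 0 means it does not. It also assumes that a simple -e file test is what decides whether a local link is good.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my %bad_links;   # page URL => reference to an array of bad links on that page
my %good;        # filesystem path => 1 or 0 (assumed: 1 = exists, 0 = missing)

# Hypothetical helper: record which of a page's local links are bad.
# $page_url becomes the key in %bad_links; @paths are absolute
# filesystem paths already worked out for that page's links.
sub record_bad_links {
    my ($page_url, @paths) = @_;
    foreach my $path (@paths) {
        # Touch the filesystem only the first time we see this path;
        # later pages that link to the same file reuse the cached answer.
        unless (exists $good{$path}) {
            $good{$path} = (-e $path) ? 1 : 0;
        }
        # Perl creates the inner array automatically on the first push.
        push @{ $bad_links{$page_url} }, $path unless $good{$path};
    }
}

# Usage: a temporary file stands in for a link target that exists; the
# second path is assumed not to exist on the machine running the sketch.
my ($fh, $existing) = tempfile(UNLINK => 1);
my $page    = 'http://www.example.com/index.html';
my $missing = '/no/such/dir/missing.html';

record_bad_links($page, $existing, $missing);

# Report each page that has bad links, with the bad links indented below it.
foreach my $url (sort keys %bad_links) {
    print "$url\n";
    print "    $_\n" foreach @{ $bad_links{$url} };
}
```

A page with no bad links never gets an entry in %bad_links, so the report loop visits only the pages that need attention, while %good ends up holding an entry for every path tested, good or bad.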

ISBN: 1565926471