Chapter 10. Generating a Web Access Report

This chapter completes the example begun in Chapter 8 and Chapter 9. We’ve reached the final stage in creating our log analysis script, where we store the information about the individual “visitors” whose activities we are attempting to reconstruct, and use that data to print out the actual report. In describing these final enhancements to the log_report.plx script, we will look first at the &new_visit and &add_to_visit routines used to store our visit data. Then we will learn about a very useful pair of functions for producing formatted output: printf and sprintf. We’ll talk about how to produce our report, and then how to embellish it with information about the site’s more popular pages, as well as the referral strings and user-agent data available from combined-format logs. Finally, we’ll talk about how to make the script email its report to us, and how to schedule it to run at periodic intervals using the Unix cron facility.

The &new_visit and &add_to_visit Subroutines

In Chapter 8 we looked at the log_report.plx script’s &store_line subroutine, which served as a “traffic cop,” directing the data from each line to either the &new_visit or the &add_to_visit subroutine. Now let’s take a look at those two subroutines in detail.

We’ll start with the &new_visit subroutine, which we use to store information about a visit that has just started (either because the current log file line relates to an entirely ...
