The Log-Analysis Script
Now that the hostname lookups are taken care of, it’s time to write the log-analysis script. Example 8-2 shows the first version of that script.
Example 8-2. log_report.plx, a web log-analysis script (first version)
#!/usr/bin/perl -w

# log_report.plx

# report on web visitors

use strict;

while (<>) {
    my ($host, $ident_user, $auth_user, $date, $time,
        $time_zone, $method, $url, $protocol, $status, $bytes) =
    /^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] "(\S+) (.+?) (\S+)" (\S+) (\S+)$/;

    print join "\n", $host, $ident_user, $auth_user, $date, $time,
        $time_zone, $method, $url, $protocol, $status, $bytes, "\n";
}
This first version of the script is simple. All it does is read in lines via the <> operator, parse those lines into their component pieces, and then print out the parsed elements for debugging purposes. The line that does the printing out is interesting, in that it uses Perl’s join function, which you haven’t seen before. The join function is the polar opposite, so to speak, of the split function: it lets you specify a string (in its first argument) that will be used to join the list comprising the rest of its arguments into a scalar. In other words, the Perl expression join '-', 'a', 'b', 'c' would return the string a-b-c. And in this case, using \n to join the various elements parsed by our script lets us print out a newline-separated list of those parsed items.
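If you’d like to see join and split side by side before moving on, here is a minimal sketch. The file name join_demo.plx, the glue and format_fields subroutines, and the field values are all invented for illustration; none of them come from the log_report.plx script. The format_fields subroutine mimics the print line from Example 8-2 to show one detail worth noticing: because the final "\n" is itself an element of the list, join puts a separator in front of it too, so each record ends with a blank line.

```perl
#!/usr/bin/perl -w

# join_demo.plx (hypothetical file name)
# a sketch of join, split, and the print line from Example 8-2

use strict;

# join glues a list into one scalar, using its first argument as the glue.
sub glue {
    my ($separator, @items) = @_;
    return join $separator, @items;
}

# Mimic the print line in Example 8-2: the final "\n" is itself a list
# element, so join places a separator in front of it as well.
sub format_fields {
    my @fields = @_;
    return join "\n", @fields, "\n";
}

my $joined = glue('-', 'a', 'b', 'c');
print "$joined\n";                       # a-b-c

# split goes the other way, turning the scalar back into a list.
my @pieces = split /-/, $joined;
print scalar(@pieces), " pieces\n";      # 3 pieces

# Invented field values, standing in for one parsed log line.
print format_fields('host.example.com', 'GET', '/index.html', '200');
```

Running this prints a-b-c, then 3 pieces, then the four invented fields one per line, followed by an empty line. That trailing empty line is what separates one visitor’s record from the next in the debugging output of log_report.plx.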
The Mammoth Regular Expression
The real juicy part of this script, though, ...