The Log-Analysis Script
Now that the hostname lookups are taken care of, it’s time to write the log-analysis script. Example 8-2 shows the first version of that script.
Example 8-2. log_report.plx, a web log-analysis script (first version)
#!/usr/bin/perl -w
# log_report.plx
# report on web visitors
use strict;
while (<>) {
    # The pattern below must sit on a single line: without the /x modifier,
    # a line break inside it would have to match a literal newline.
    my ($host, $ident_user, $auth_user, $date, $time,
        $time_zone, $method, $url, $protocol, $status, $bytes) =
    /^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] "(\S+) (.+?) (\S+)" (\S+) (\S+)$/;
    print join "\n", $host, $ident_user, $auth_user, $date, $time,
                     $time_zone, $method, $url, $protocol, $status,
                     $bytes, "\n";
}
This first version of the script is simple. All it does is read in
lines via the <> operator, parse those lines into their component pieces, and then print out the parsed elements for debugging purposes. The line that does the printing is interesting in that it uses Perl’s join function, which you haven’t seen before. The join function is the opposite, so to speak, of the split function: it takes a string (its first argument) and uses it to join the list made up of the rest of its arguments into a single scalar. In other words, the Perl expression join '-', 'a', 'b', 'c' would return the string a-b-c. In this case, using \n to join the various elements parsed by our script lets us print out a newline-separated list of those parsed items.
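To make the relationship between join and split concrete, here is a minimal sketch that runs on its own. The field values in @fields are made up for illustration and do not come from a real log file.

```perl
#!/usr/bin/perl -w
use strict;

# join glues a list into one string, using its first argument as the separator.
my $joined = join '-', 'a', 'b', 'c';
print "$joined\n";                      # a-b-c

# split goes the other way: it breaks a string into a list on a pattern.
my @pieces = split /-/, $joined;
print scalar(@pieces), " pieces\n";     # 3 pieces

# Joining on "\n" puts each element on its own line, much as
# log_report.plx does with its parsed fields.
# (Hypothetical sample values, for illustration only.)
my @fields = ('192.0.2.1', '-', '-', '200');
my $report = join "\n", @fields;
print "$report\n";
```

Note that join places the separator only between elements, never after the last one, which is why the script passes "\n" as a final list element to end each record.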
The Mammoth Regular Expression
The real juicy part of this script, though, ...