November 2002
Intermediate to advanced
640 pages
16h 33m
English
You want to do calculations based on the information in your web server’s access log file.
Open the file and parse each line with a regular expression that matches the log file format. This regular expression matches the NCSA Combined Log Format:
$pattern = '/^([^ ]+) ([^ ]+) ([^ ]+) (\[[^\]]+\]) "(.*) (.*) (.*)" ([0-9\-]+) ([0-9\-]+) "(.*)" "(.*)"$/';
This program parses the NCSA Combined Log Format lines and displays a list of pages sorted by the number of requests for each page:
$log_file = '/usr/local/apache/logs/access.log';
$pattern = '/^([^ ]+) ([^ ]+) ([^ ]+) (\[[^\]]+\]) "(.*) (.*) (.*)" ([0-9\-]+) ([0-9\-]+) "(.*)" "(.*)"$/';

$fh = fopen($log_file,'r') or die("Can't open $log_file");
$i = 1;
$requests = array();
while (! feof($fh)) {
    // read each line and trim off leading/trailing whitespace
    if ($s = trim(fgets($fh,16384))) {
        // match the line to the pattern
        if (preg_match($pattern,$s,$matches)) {
            /* put each part of the match in an appropriately-named
             * variable */
            list($whole_match,$remote_host,$logname,$user,$time,
                 $method,$request,$protocol,$status,$bytes,$referer,
                 $user_agent) = $matches;
            // keep track of the count of each request; start the
            // counter at zero the first time a request is seen
            if (! isset($requests[$request])) {
                $requests[$request] = 0;
            }
            $requests[$request]++;
        } else {
            // complain if the line didn't match the pattern
            error_log("Can't parse line $i: $s");
        }
    }
    $i++;
}
fclose($fh) or die("Can't close $log_file");

// sort the array (in reverse) by number of requests
arsort($requests);

// print formatted results
foreach ($requests ...
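To try the pattern without a real access log, the counting step can be sketched against a few in-memory lines. This is only an illustration under stated assumptions: the sample log lines, the `count_requests()` helper, and the `printf()` output format are hypothetical and are not taken from the program above. Only the pattern and the use of `preg_match()` and `arsort()` come from it.

```php
<?php
// The NCSA Combined Log Format pattern from the recipe.
$pattern = '/^([^ ]+) ([^ ]+) ([^ ]+) (\[[^\]]+\]) "(.*) (.*) (.*)" ([0-9\-]+) ([0-9\-]+) "(.*)" "(.*)"$/';

// Hypothetical sample lines (made up for this sketch).
$sample_lines = array(
    '192.0.2.1 - - [10/Oct/2002:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "ExampleAgent/1.0"',
    '192.0.2.2 - - [10/Oct/2002:13:56:01 -0700] "GET /about.html HTTP/1.0" 200 1043 "-" "ExampleAgent/1.0"',
    '192.0.2.1 - - [10/Oct/2002:13:57:12 -0700] "GET /index.html HTTP/1.0" 304 - "-" "ExampleAgent/1.0"',
    'this line is not in the expected format',
);

// Hypothetical helper: count requests per page, most requested first.
function count_requests($lines, $pattern) {
    $requests = array();
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        if (preg_match($pattern, $line, $matches)) {
            // $matches[6] is the requested path (the sixth capture group)
            $request = $matches[6];
            if (! isset($requests[$request])) {
                $requests[$request] = 0;
            }
            $requests[$request]++;
        }
        // lines that do not match are ignored in this sketch
    }
    // sort by count, highest first, keeping the path keys
    arsort($requests);
    return $requests;
}

$counts = count_requests($sample_lines, $pattern);
foreach ($counts as $request => $count) {
    printf("%6d  %s\n", $count, $request);
}
```

With these sample lines, `/index.html` is counted twice and `/about.html` once, and the malformed line is skipped. Keeping the counting in a function that takes an array of lines makes it easy to test the pattern separately from file handling.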