The anatomy of an Apache log file

Before we create the regular expression that will match a line of the Apache log file, we need to understand what kind of information each line holds.

Let's take a look at a line from access.log:

127.0.0.1 - jan [30/Jun/2004:22:20:17 +0200] "GET /cgi-bin/trac.cgi/login HTTP/1.1" 302 4370 "http://saturn.solar_system/cgi-bin/trac.cgi" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040620 Galeon/1.3.15"

The Apache access log that we are reading follows the %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" format. Let's take a look at each part:

  • %h: The first part of the log is the IP address of the client (127.0.0.1)
  • %l: In the second part, the hyphen in the output indicates that the requested piece of information is not ...
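
To make the mapping between the format string and the sample line concrete, here is a minimal sketch that splits the line into one field per directive. The pattern, the parseLogLine function, and the field names are assumptions made for illustration only, not the expression developed later in this chapter. The sketch also assumes that quoted values contain no escaped quotes.

```javascript
// Sample line from access.log, copied from the text above.
const line = '127.0.0.1 - jan [30/Jun/2004:22:20:17 +0200] "GET /cgi-bin/trac.cgi/login HTTP/1.1" 302 4370 "http://saturn.solar_system/cgi-bin/trac.cgi" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040620 Galeon/1.3.15"';

// Hypothetical pattern with one capture group per directive, in order:
// %h %l %u [%t] "%r" %>s %b "%{Referer}i" "%{User-agent}i"
// Simplification: quoted values are assumed to contain no escaped quotes.
const combinedLogPattern =
  /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/;

// Returns an object with one string property per field,
// or null when the line does not match the pattern.
function parseLogLine(text) {
  const match = combinedLogPattern.exec(text);
  if (match === null) {
    return null;
  }
  // Skip match[0] (the whole line) and name the nine captured groups.
  const [, host, logname, user, time, request, status, bytes, referer, userAgent] = match;
  return { host, logname, user, time, request, status, bytes, referer, userAgent };
}

const fields = parseLogLine(line);
console.log(fields);
```

Every captured value is a string, so fields.status holds "302" rather than the number 302. The size field is matched with \S+ instead of \d+ because Apache writes a hyphen for %b when no bytes were sent.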
