HTTP: The Definitive Guide
by David Gourley, Brian Totty, Marjorie Sayer, Anshu Aggarwal, Sailu Reddy
What to Log?
For the most part, logging is done for two reasons: to look for problems on the server or proxy (e.g., which requests are failing), and to generate statistics about how web sites are accessed. Statistics are useful for marketing, billing, and capacity planning (for instance, determining the need for additional servers or bandwidth).
You could log all of the headers in an HTTP transaction, but for servers and proxies that process millions of transactions per day, the sheer bulk of that data would quickly get out of hand. You would also end up logging a lot of information that you don’t really care about and may never even look at.
Typically, just the basics of a transaction are logged. A few examples of commonly logged fields are:
HTTP method
HTTP version of client and server
URL of the requested resource
HTTP status code of the response
Size of the request and response messages (including any entity bodies)
Timestamp of when the transaction occurred
Referer and User-Agent header values
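To make the list above concrete, here is a minimal sketch in Python of a record holding those fields and a function that writes it out as one line. The `LogEntry` class, the `format_entry` function, the tab-separated layout, and the sample values are all assumptions made for illustration; they are not a standard log format or the format used by any particular server.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class LogEntry:
    """One logged HTTP transaction (hypothetical structure for illustration)."""
    timestamp: datetime      # when the transaction occurred
    method: str              # HTTP method, e.g. GET or POST
    url: str                 # URL of the requested resource
    client_version: str      # HTTP version used by the client
    server_version: str      # HTTP version used by the server
    status: int              # HTTP status code of the response
    request_bytes: int       # size of the request message, including any body
    response_bytes: int      # size of the response message, including any body
    referer: str = "-"       # "-" stands in for a missing header (assumed convention)
    user_agent: str = "-"


def format_entry(entry: LogEntry) -> str:
    """Render an entry as a single tab-separated line (an assumed layout)."""
    fields = [
        entry.timestamp.isoformat(),
        entry.method,
        entry.url,
        entry.client_version,
        entry.server_version,
        str(entry.status),
        str(entry.request_bytes),
        str(entry.response_bytes),
        entry.referer,
        entry.user_agent,
    ]
    return "\t".join(fields)


# Made-up sample transaction, not real log data.
example = LogEntry(
    timestamp=datetime(2002, 9, 27, 12, 0, 0, tzinfo=timezone.utc),
    method="GET",
    url="/index.html",
    client_version="HTTP/1.0",
    server_version="HTTP/1.1",
    status=200,
    request_bytes=312,
    response_bytes=4096,
    user_agent="ExampleBrowser/1.0",
)

print(format_entry(example))
```

Keeping each transaction on one line with a fixed field order makes the log easy to split and aggregate later, at the cost of discarding every header not listed.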
The HTTP method and URL tell what the request was trying to do, for example, GETting a resource or POSTing an order form. The URL can also be used to track the popularity of pages on the web site.
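As a rough illustration of tracking page popularity from logged methods and URLs, the sketch below counts GET requests per URL. The `transactions` list and the `page_popularity` function are hypothetical; a real log would first have to be parsed into (method, URL) pairs like these.

```python
from collections import Counter

# Hypothetical (method, URL) pairs as they might be extracted from a log.
transactions = [
    ("GET", "/index.html"),
    ("GET", "/products.html"),
    ("POST", "/order"),
    ("GET", "/index.html"),
    ("GET", "/index.html"),
    ("GET", "/products.html"),
]


def page_popularity(entries):
    """Count GET requests per URL and return them most-requested first."""
    counts = Counter(url for method, url in entries if method == "GET")
    return counts.most_common()


for url, hits in page_popularity(transactions):
    print(f"{hits}\t{url}")
```

Only GET requests are counted here, on the assumption that page views are what matter for popularity; the POST to `/order` would be tallied separately if order volume were the statistic of interest.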
The version strings give hints about the client and server, which are useful in debugging strange or unexpected interactions between clients and servers. For example, if requests are failing at a higher-than-expected rate, the version information may point to a new release of a browser that is unable ...