A number of tools are available for benchmarking proxy caches. Some are self-contained because they generate all requests and responses internally. Others rely on trace log files for requests and on live origin servers for responses. Each technique has advantages and disadvantages.
Using trace files is attractive because the client and server
programs are simpler to implement. A self-contained benchmark
is more complicated because it uses mathematical formulas to
generate new requests and responses. For example, a
particular request has some probability of being a cache hit,
of being cachable, and of being a certain size.
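
To make the idea concrete, here is a minimal sketch of how a self-contained benchmark might draw each request from a few workload parameters. It is an illustration under stated assumptions, not the method of any particular tool: the class and field names, the default probabilities, the exponential size distribution, and the `origin.example` hostname are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticObject:
    url: str
    cachable: bool
    size: int  # response body size in bytes


class WorkloadGenerator:
    """Draws synthetic requests from simple workload parameters (all assumed)."""

    def __init__(self, hit_prob=0.5, cachable_prob=0.8, mean_size=10_000, seed=None):
        self.hit_prob = hit_prob            # chance of re-requesting a known object
        self.cachable_prob = cachable_prob  # chance that a new object is cachable
        self.mean_size = mean_size          # mean of the assumed exponential sizes
        self.rng = random.Random(seed)
        self.cachable_seen = []             # objects that could produce a cache hit
        self.count = 0

    def _new_object(self):
        self.count += 1
        cachable = self.rng.random() < self.cachable_prob
        size = max(1, int(self.rng.expovariate(1.0 / self.mean_size)))
        obj = SyntheticObject(f"http://origin.example/obj{self.count}", cachable, size)
        if cachable:
            self.cachable_seen.append(obj)
        return obj

    def next_request(self):
        # A potential hit is a repeat of an earlier cachable object;
        # anything else is a brand-new URL.
        if self.cachable_seen and self.rng.random() < self.hit_prob:
            return self.rng.choice(self.cachable_seen)
        return self._new_object()
```

Calling `WorkloadGenerator(seed=1).next_request()` repeatedly yields a reproducible request stream. Until some cachable objects have been issued, the share of repeated requests stays below `hit_prob`, so a benchmark built this way would need a warm-up phase.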
With trace files, instead of managing complex workload
parameters, the client just reads a file of URLs and sends
HTTP requests. In essence, the workload parameters are
embedded in the log files.
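
By contrast, a trace-driven client can be very small. The sketch below assumes a hypothetical trace format of one URL per line (real proxy logs carry more fields and would need parsing first) and a proxy that accepts plain HTTP requests. The function names are illustrative.

```python
import urllib.request


def read_trace(lines):
    """Yield URLs from an iterable of lines, skipping blanks and '#' comments."""
    for line in lines:
        url = line.strip()
        if url and not url.startswith("#"):
            yield url


def make_proxy_fetcher(proxy_url, timeout=10.0):
    """Return a fetch(url) function that sends HTTP requests through proxy_url."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url})
    )

    def fetch(url):
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, len(resp.read())

    return fetch


def replay(urls, fetch):
    """Request each URL in order and collect simple counters."""
    stats = {"requests": 0, "errors": 0, "bytes": 0}
    for url in urls:
        stats["requests"] += 1
        try:
            _status, nbytes = fetch(url)
            stats["bytes"] += nbytes
        except OSError:
            # urllib's URLError and HTTPError, and socket timeouts, are all OSErrors.
            stats["errors"] += 1
    return stats
```

A run might open the trace file and call `replay(read_trace(f), make_proxy_fetcher("http://proxy.example:3128"))`, where the proxy address is a placeholder. Passing `fetch` as a parameter keeps the replay loop testable without a network.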
One problem is that trace files don’t normally record
all the information needed to correctly play back the
requests. For example, a log file doesn’t normally say
whether a particular request was on a persistent connection. It’s
also unlikely to indicate certain request headers.
Trace log files are usually taken from production proxy caches. This is good because the trace represents real web traffic on your network, generated by real users. If you want to run a trace-based benchmark or simulation but don’t have any log files, you might be out of luck. Log files are not usually shared between organizations ...