Content Types
Table A-1 shows the breakdown of responses by content type. Images (GIF and JPEG together) make up about 60% of all web requests by count and 40% by volume. The top three content types (GIF, JPEG, and HTML) account for 95% of all requests and 63% of all traffic volume.
Table A-1. The Most Popular Content Types (IRCache Data)
| Content Type | Count (%) | Volume (%) | Mean Size (KB) |
|---|---|---|---|
| image/gif | 40.8 | 16.6 | 3.75 |
| text/html | 35.1 | 23.1 | 6.07 |
| image/jpeg | 19.0 | 22.9 | 11.12 |
| text/plain | 1.7 | 2.5 | 13.45 |
| application/x-javascript | 1.5 | 0.2 | 1.45 |
| application/octet-stream | 0.5 | 10.2 | 179.22 |
| application/zip | 0.1 | 8.0 | 684.14 |
| video/mpeg | 0.0 | 3.4 | 761.90 |
| application/pdf | 0.0 | 1.3 | 336.30 |
| audio/mpeg | 0.0 | 2.5 | 1707.70 |
| video/quicktime | 0.0 | 1.2 | 1205.42 |
| All others | 1.1 | 8.1 | 69.60 |
This data is derived from the fifth and tenth fields of Squid’s access.log file, which hold the response size and the content type, respectively. The logs include many responses without a content type, such as 302 (Found) and 304 (Not Modified). All non-200 status responses without a content type have been filtered out. I have not eliminated the effects of popularity: an object requested many times is counted once for each request. Thus, these numbers represent the percentage of requests made by clients rather than the percentage of content that lives at origin servers.
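The tallying described above can be sketched in a few lines of Python. This is only an illustration, not the script used to produce Table A-1. It assumes Squid's native access.log format, in which the fourth field looks like `TCP_MISS/200`, the fifth is the size in bytes, and the tenth is the content type (logged as `-` when absent). The function name `summarize_content_types` and the choice of 1 KB = 1024 bytes are my own assumptions.

```python
from collections import defaultdict


def summarize_content_types(lines):
    """Return (type, count %, volume %, mean size in KB) rows, largest count first.

    Hypothetical helper: assumes whitespace-separated native-format log lines.
    """
    counts = defaultdict(int)   # number of responses per content type
    volume = defaultdict(int)   # total bytes per content type

    for line in lines:
        fields = line.split()
        if len(fields) < 10:
            continue  # skip malformed or truncated entries

        status = fields[3].split("/")[-1]  # "TCP_MISS/200" -> "200"
        size = fields[4]                   # fifth field: bytes
        ctype = fields[9].lower()          # tenth field: content type

        # Filter out non-200 responses that carry no content type.
        if ctype == "-" and status != "200":
            continue
        if not size.isdigit():
            continue

        counts[ctype] += 1
        volume[ctype] += int(size)

    total_count = sum(counts.values())
    total_bytes = sum(volume.values())
    if total_count == 0 or total_bytes == 0:
        return []

    rows = [
        (
            ctype,
            100.0 * counts[ctype] / total_count,   # share of requests
            100.0 * volume[ctype] / total_bytes,   # share of traffic volume
            volume[ctype] / counts[ctype] / 1024,  # mean size, assuming 1 KB = 1024 bytes
        )
        for ctype in counts
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows
```

To try it on a real log, pass an open file object, for example `summarize_content_types(open("access.log"))`. Because every log entry is counted, popular objects contribute once per request, which matches the per-request interpretation given above.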
Figure A-3 shows some long-term trends of the three most popular content types and JavaScript. The percentage of JPEG images remains more or less constant at about 20%. GIF requests appear to be decreasing, with a corresponding increase in HTML. The GIF and JPEG traces are strongly periodic. The peaks and valleys correspond to weekends and weekdays. ...