The following sections describe the HTTP headers that specify the type and length of the content, and the version of the content being sent. Note that in this section we often use the term message. This term is used to describe the data that comprises the HTTP headers along with their associated content; the content is the actual page, image, file, etc.
The Content-Type header should be included in every set of headers, according to the standard, and Apache will generate one if your code doesn't. It will be whatever is specified in the relevant DefaultType configuration directive, or text/plain if none is active.
According to section 14.13 of the HTTP specification, the Content-Length header gives the number of octets (8-bit bytes) in the body of a message. If the length can be determined prior to sending, it can be very useful to include it. The most important reason is that keep-alive requests (when the same connection is used to fetch more than one object from the web server) work only with responses that contain a Content-Length header. In mod_perl, the header can be set through the request object before the response headers are sent.
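Because the header counts octets rather than characters, the body has to be measured after it has been encoded. The following is a minimal sketch of that calculation in plain Perl. The helper name content_length_for() is hypothetical, and the call shown in the comment assumes the mod_perl 1.x header_out() method on a request object $r.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Content-Length counts octets, so encode the text before measuring it.
# content_length_for() is a hypothetical helper, not part of mod_perl.
sub content_length_for {
    my ($text, $charset) = @_;
    return length(encode($charset, $text));
}

my $body   = "caf\x{e9}\n";                        # 5 characters
my $length = content_length_for($body, 'UTF-8');   # 6 octets in UTF-8

# Inside a mod_perl 1.x handler (assumed API), with $r the request object:
#   $r->header_out('Content-Length', $length);
print "Content-Length: $length\n";
```

If the body is already a string of octets, length() alone is enough and the encoding step can be skipped.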
The Content-Length header can have a significant
impact on caches by invalidating cache entries, as the following
extract from the specification explains:
The response to a HEAD request MAY be cacheable in the sense that the information contained in the response MAY be used to update a previously cached entity from that resource. If the new field values indicate that the cached entity differs from the current entity (as would be indicated by a change in Content-Length, Content-MD5, ETag or Last-Modified), then the cache MUST treat the cache entry as stale.
It is important not to send an erroneous Content-Length header in a response to either a GET or a HEAD request.
An entity tag (ETag) is a validator that can be used instead of, or in addition to, the Last-Modified header; it is a quoted string that can be used to identify different versions of a particular resource. An entity tag is added to the response headers like any other header.
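A minimal sketch of building such a header value follows. The version string and the helper name etag_for() are hypothetical, the version is assumed to contain no double quotes or backslashes, and the call in the comment assumes the mod_perl 1.x header_out() method.

```perl
use strict;
use warnings;

our $VERSION = '1.42';   # hypothetical version identifier for the resource

# An entity tag is a quoted string, so the double quotes are part of
# the header value itself.
sub etag_for {
    my ($version) = @_;
    return qq{"$version"};
}

my $etag = etag_for($VERSION);

# Inside a mod_perl 1.x handler (assumed API), with $r the request object:
#   $r->header_out('ETag', $etag);
print "ETag: $etag\n";
```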
mod_perl offers the $r->set_etag( ) method if the Apache::File module has been loaded. However, we strongly recommend that you don't use the set_etag( ) method! set_etag( ) is meant to be used in conjunction with a static request for a file on disk that has been stat( )ed in the course of the current request. It is inappropriate and dangerous to use it for dynamic content.
By sending an entity tag we are promising the recipient that we will
not send the same
ETag for the same resource again
unless the content is "equal" to
what we are sending now.
The pros and cons of using entity tags are discussed in section 13.3 of the HTTP specification. For mod_perl programmers, that discussion can be summed up as follows.
There are strong and weak validators. Strong validators change whenever a single bit changes in the response; i.e., when anything changes, even if the meaning is unchanged. Weak validators change only when the meaning of the response changes. Strong validators are needed for caches to allow for sub-range requests. Weak validators allow more efficient caching of equivalent objects. Algorithms such as MD5 or SHA are good strong validators, but what is usually required when we want to take advantage of caching is a good weak validator.
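The difference between the two kinds of validator can be illustrated with a short sketch. The strong value is a digest of the exact octets. The weak value applies a hypothetical equivalence rule, chosen purely for illustration, under which runs of whitespace do not change the meaning of the document; a real application would substitute its own notion of equivalence.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Strong: digest of the exact octets, so any changed bit changes the value.
sub strong_digest {
    my ($octets) = @_;
    return md5_hex($octets);
}

# Weak (hypothetical rule): collapse runs of whitespace first, so that
# documents differing only in whitespace get the same value.
sub weak_digest {
    my ($octets) = @_;
    (my $normalized = $octets) =~ s/\s+/ /g;
    return md5_hex($normalized);
}

my $first  = "<p>Hello,   world</p>\n";
my $second = "<p>Hello, world</p> ";

printf "strong validators %s\n",
    strong_digest($first) eq strong_digest($second) ? 'match' : 'differ';
printf "weak validators %s\n",
    weak_digest($first) eq weak_digest($second) ? 'match' : 'differ';
```

Both functions expect octets; md5_hex() will die if it is handed a string containing wide characters.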
A Last-Modified time, when used as a validator in a request, can be strong or weak, depending on a couple of rules described in section 13.3.3 of the HTTP standard. This is mostly relevant for range requests, as this quote from section 14.27 explains:
If the client has no entity tag for an entity, but does have a Last-Modified date, it MAY use that date in an If-Range header.
But it is not limited to range requests. As section 13.3.1 states,
the value of the
Last-Modified header can also be
used as a cache validator.
The fact that a Last-Modified date may be used as a strong validator can be pretty disturbing if we are in fact changing our output slightly without changing its semantics. To prevent this kind of misunderstanding between us and the cache servers in the response chain, we can send a weak validator in an ETag header. This is possible because the specification states:
If a client wishes to perform a sub-range retrieval on a value for which it has only a Last-Modified time and no opaque validator, it MAY do this only if the Last-Modified time is strong in the sense described here.
In other words, by sending an
ETag that is marked
as weak, we prevent the cache server from using the
Last-Modified header as a strong validator.
An ETag value is marked as a weak validator by prepending the string W/ to the quoted string; otherwise, it is strong. In Perl, this means placing W/ in front of the quoted string when the header value is built.
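As a sketch, the weak form differs from the strong one only in its prefix. The helper name weak_etag() and the version string are hypothetical, and the call in the comment assumes the mod_perl 1.x header_out() method.

```perl
use strict;
use warnings;

our $VERSION = '1.42';   # hypothetical version identifier for the resource

# A weak entity tag is the quoted string with W/ in front of it.
sub weak_etag {
    my ($version) = @_;
    return qq{W/"$version"};
}

my $etag = weak_etag($VERSION);

# Inside a mod_perl 1.x handler (assumed API), with $r the request object:
#   $r->header_out('ETag', $etag);
print "ETag: $etag\n";
```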
Consider carefully which string is chosen to act as a validator. We are on our own with this decision:
... only the service author knows the semantics of a resource well enough to select an appropriate cache validation mechanism, and the specification of any validator comparison function more complex than byte-equality would open up a can of worms. Thus, comparisons of any other headers (except Last-Modified, for compatibility with HTTP/1.0) are never used for purposes of validating a cache entry.
If we are composing a message from multiple components, it may be necessary to combine some kind of version information for all these components into a single string.
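One way to do this, shown here only as a sketch, is to join the component versions in a fixed order and digest the result. The component names (template, data, stylesheet) and the helper name combined_etag() are hypothetical, and the names and versions are assumed not to contain the ; or = separators. The tag is marked weak on the assumption that the versions track the meaning of the components rather than the exact octets of the output.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Combine per-component versions into a single weak entity tag.
# Keys are sorted so that argument order does not affect the result.
sub combined_etag {
    my (%versions) = @_;
    my $joined = join ';', map { "$_=$versions{$_}" } sort keys %versions;
    return 'W/"' . md5_hex($joined) . '"';
}

my $etag = combined_etag(
    template   => '3',     # hypothetical component versions
    data       => '17',
    stylesheet => '2',
);
print "ETag: $etag\n";
```

Any change to a single component version yields a different tag, while an unchanged set of versions always yields the same one.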
If we are producing relatively large documents, or content that does not change frequently, then a strong entity tag will probably be preferred, since this will give caches a chance to transfer the document in chunks.