High Performance Web Sites Essential Knowledge for Front-End Engineers

By Steve Souders
Table of Contents

Chapter 1: The Importance of Frontend Performance
Most of my web career has been spent as a backend engineer. As such, I dutifully approached each performance project as an exercise in backend optimization, concentrating on compiler options, database indexes, memory management, etc. There's a lot of attention and many books devoted to optimizing performance in these areas, so that's where most people spend time looking for improvements. In reality, for most web pages, less than 10–20% of the end user response time is spent getting the HTML document from the web server to the browser. If you want to dramatically reduce the response times of your web pages, you have to focus on the other 80–90% of the end user experience. What is that 80–90% spent on? How can it be reduced? The chapters that follow lay the groundwork for understanding today's web pages and provide 14 rules for making them faster.
Tracking Web Page Performance
In order to know what to improve, we need to know where the user spends her time waiting. The first figure below shows the HTTP traffic when Yahoo!'s home page (http://www.yahoo.com) is downloaded using Internet Explorer. Each bar is one HTTP request. The first bar, labeled html, is the initial request for the HTML document. The browser parses the HTML and starts downloading the components in the page. In this case, the browser's cache was empty, so all of the components had to be downloaded. The HTML document is only 5% of the total response time. The user spends most of the other 95% waiting for the components to download; she also spends a small amount of time waiting for HTML, scripts, and stylesheets to be parsed, as shown by the blank gaps between downloads.
Figure: Downloading http://www.yahoo.com in Internet Explorer, empty cache
The second figure shows the same URL downloaded in Internet Explorer a second time. The HTML document is only 12% of the total response time. Most of the components don't have to be downloaded because they're already in the browser's cache.
Figure: Downloading http://www.yahoo.com in Internet Explorer, primed cache
Five components are requested in this second page view:
One redirect
This redirect was downloaded previously, but the browser is requesting it again. The HTTP response's status code is 302 ("Found" or "moved temporarily") and there is no caching information in the response headers, so the browser can't cache the response. I'll discuss HTTP in Chapter 2.
Three uncached images
The next three requests are for images that were not downloaded in the initial page view. These are images for news photos and ads that change frequently.
One cached image
The last HTTP request is a conditional GET request. The image is cached, but because of the HTTP response headers, the browser has to check that the image is up-to-date before showing it to the user. Conditional GET requests are also described in Chapter 2.
Where Does the Time Go?
Looking at the HTTP traffic in this way, we see that at least 80% of the end user response time is spent on the components in the page. If we dig deeper into the details of these charts, we start to see how complex the interplay between browsers and HTTP becomes. Earlier, I mentioned how the HTTP status codes and headers affect the browser's cache. In addition, we can make these observations:
  • The cached scenario (the primed-cache figure) doesn't have as much download activity. Instead, you can see a blank space with no downloads that occurs immediately following the HTML document's HTTP request. This is time when the browser is parsing HTML, JavaScript, and CSS, and retrieving components from its cache.
  • Varying numbers of HTTP requests occur in parallel. The primed-cache figure has a maximum of three HTTP requests happening in parallel, whereas in the empty-cache figure, there are as many as six or seven simultaneous HTTP requests. This behavior is due to the number of different hostnames being used, and whether they use HTTP/1.0 or HTTP/1.1. A later chapter explains these issues.
  • Parallel requests don't happen during requests for scripts. That's because in most situations, browsers block additional HTTP requests while they download scripts. A later chapter explains why this happens and how to use this knowledge to improve page load times.
Figuring out exactly where the time goes is a challenge. But it's easy to see where the time does not go—it does not go into downloading the HTML document, including any backend processing. That's why frontend performance is important.
The Performance Golden Rule
This phenomenon of spending only 10–20% of the response time downloading the HTML document is not isolated to Yahoo!'s home page. This statistic holds true for all of the Yahoo! properties I've analyzed (except for Yahoo! Search because of the small number of components in the page). Furthermore, this statistic is true across most web sites. The table below shows 10 top U.S. web sites extracted from http://www.alexa.com. Note that all of these except AOL were in the top 10 U.S. web sites. Craigslist.org was in the top 10, but its pages have few or no images, scripts, and stylesheets, and thus it was a poor example to use. So, I chose to include AOL in its place.
Table: Percentage of time spent downloading the HTML document for 10 top web sites

Site         Empty cache   Primed cache
AOL          6%            14%
Amazon       18%           14%
CNN          19%           8%
eBay         2%            8%
Google       14%           36%
MSN          3%            5%
MySpace      4%            14%
Wikipedia    20%           12%
Yahoo!       5%            12%
YouTube      3%            5%
All of these web sites spend 20% or less of the total response time retrieving the HTML document. The one exception is Google in the primed cache scenario. This is because http://www.google.com had only six components, and all but one were configured to be cached by the browser. On subsequent page views, with all those components cached, the only HTTP requests were for the HTML document and an image beacon.
In any optimization effort, it's critical to profile current performance to identify where you can achieve the greatest improvements. It's clear that the place to focus is frontend performance.
First, there is more potential for improvement in focusing on the frontend. If we were able to cut backend response times in half, the end user response time would decrease only 5–10% overall. If, instead, we cut frontend response times in half, we would reduce overall response times by 40–45%.
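The arithmetic behind these percentages can be checked with a short sketch. The function name `overall_reduction` is made up for illustration, and the 10% and 20% backend shares are simply the two ends of the range quoted in the text, not new measurements.

```python
def overall_reduction(share, speedup_fraction):
    """Fraction of total response time saved when a part that accounts for
    `share` of the total is made faster by `speedup_fraction`."""
    return share * speedup_fraction

# The HTML document (backend) is 10-20% of the total; the frontend is the rest.
for backend_share in (0.10, 0.20):
    frontend_share = 1.0 - backend_share
    backend_saving = overall_reduction(backend_share, 0.5)
    frontend_saving = overall_reduction(frontend_share, 0.5)
    print(f"backend share {backend_share:.0%}: "
          f"halving backend saves {backend_saving:.0%}, "
          f"halving frontend saves {frontend_saving:.0%}")
```

Halving a part that is only 10–20% of the total can never save more than 5–10%, which is why the frontend is the better place to start.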
Second, frontend improvements typically require less time and fewer resources. Reducing backend latency involves projects such as redesigning application architecture and code, finding and optimizing critical code paths, adding or modifying hardware, distributing databases, etc. These projects take weeks or months. Most of the frontend performance improvements described in the following chapters involve best practices, such as changing web server configuration files; placing scripts and stylesheets in certain places within the page; and combining images, scripts, and stylesheets (Chapter 3). These projects take hours or days, much less than the time required for most backend improvements.
Chapter 2: HTTP Overview
Before diving into the specific rules for making web pages faster, it's important to understand the parts of the HyperText Transfer Protocol (HTTP) that affect performance. HTTP is how browsers and servers communicate with each other over the Internet. The HTTP specification was coordinated by the World Wide Web Consortium (W3C) and Internet Engineering Task Force (IETF), resulting in RFC 2616. HTTP/1.1 is the most common version today, but some browsers and servers still use HTTP/1.0.
HTTP is a client/server protocol made up of requests and responses. A browser sends an HTTP request for a specific URL, and a server hosting that URL sends back an HTTP response. Like many Internet services, the protocol uses a simple, plaintext format. The types of requests are GET, POST, HEAD, PUT, DELETE, OPTIONS, and TRACE. I'm going to focus on the GET request, which is the most common.
A GET request includes a URL followed by headers. The HTTP response contains a status code, headers, and a body. The following example shows the possible HTTP headers when requesting the script yahoo_2.0.0-b2.js.
GET /us.js.yimg.com/lib/common/utils/2/yahoo_2.0.0-b2.js HTTP/1.1
Host: us.js2.yimg.com
User-Agent: Mozilla/5.0 (...) Gecko/20061206 Firefox/1.5.0.9

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Wed, 22 Feb 2006 04:15:54 GMT
Content-Length: 355

var YAHOO=...
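To make the structure of a response concrete, here is a minimal sketch that splits the plaintext response above into its status code, headers, and body. The function `parse_response` is a hypothetical helper written for this illustration; it assumes the standard CRLF line endings and ignores details such as repeated or folded headers.

```python
def parse_response(raw):
    """Split a plaintext HTTP response into (status_code, headers, body)."""
    # A blank line (CRLF CRLF) separates the header block from the body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body

raw = ("HTTP/1.1 200 OK\r\n"
       "Content-Type: application/x-javascript\r\n"
       "Last-Modified: Wed, 22 Feb 2006 04:15:54 GMT\r\n"
       "Content-Length: 355\r\n"
       "\r\n"
       "var YAHOO=...")
status, headers, body = parse_response(raw)
print(status, headers["content-type"])
```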
Compression
The size of the response is reduced using compression if both the browser and server support it. Browsers announce their support of compression using the Accept-Encoding header. Servers identify compressed responses using the Content-Encoding header.
GET /us.js.yimg.com/lib/common/utils/2/yahoo_2.0.0-b2.js HTTP/1.1
Host: us.js2.yimg.com
User-Agent: Mozilla/5.0 (...) Gecko/20061206 Firefox/1.5.0.9
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Wed, 22 Feb 2006 04:15:54 GMT
Content-Length: 255
Content-Encoding: gzip

^_\;213^H^@^@^@^@^@^@^Cl\217\315j\3030^P\204_E\361IJ...
Notice how the body of the response is compressed. A later chapter explains how to turn on compression and warns about edge cases that can arise due to proxy caching. The Vary and Cache-Control headers are also discussed there.
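The negotiation can be sketched in a few lines. This is only an illustration of the idea, not how any particular web server implements it: `maybe_compress` is a hypothetical function, the script body is a stand-in, and the parsing of Accept-Encoding is simplified (quality values such as `gzip;q=0` are not honored).

```python
import gzip

def maybe_compress(body, accept_encoding):
    """Return (headers, payload), gzipping only if the client advertised gzip."""
    # Simplification: strip any ";q=..." parameter and ignore its value.
    encodings = [token.split(";")[0].strip().lower()
                 for token in accept_encoding.split(",")]
    if "gzip" in encodings:
        payload = gzip.compress(body)
        headers = {"Content-Encoding": "gzip",
                   "Content-Length": str(len(payload))}
        return headers, payload
    return {"Content-Length": str(len(body))}, body

script = b"var YAHOO = window.YAHOO || {};\n" * 20  # stand-in script body
headers, payload = maybe_compress(script, "gzip,deflate")
print(len(script), "bytes uncompressed,", len(payload), "bytes compressed")
```

A client that sends no Accept-Encoding header gets the original bytes back with no Content-Encoding header.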
Conditional GET Requests
If the browser has a copy of the component in its cache, but isn't sure whether it's still valid, a conditional GET request is made. If the cached copy is still valid, the browser uses the copy from its cache, resulting in a smaller response and a faster user experience.
Typically, the validity of the cached copy is derived from the date it was last modified. The browser knows when the component was last modified based on the Last-Modified header in the response (refer to the previous sample responses). It uses the If-Modified-Since header to send the last modified date back to the server. The browser is essentially saying, "I have a version of this resource with the following last modified date. May I just use it?"
GET /us.js.yimg.com/lib/common/utils/2/yahoo_2.0.0-b2.js HTTP/1.1
Host: us.js2.yimg.com
User-Agent: Mozilla/5.0 (...) Gecko/20061206 Firefox/1.5.0.9
Accept-Encoding: gzip,deflate
If-Modified-Since: Wed, 22 Feb 2006 04:15:54 GMT

HTTP/1.1 304 Not Modified
Content-Type: application/x-javascript
Last-Modified: Wed, 22 Feb 2006 04:15:54 GMT
If the component has not been modified since the specified date, the server returns a "304 Not Modified" status code and skips sending the body of the response, resulting in a smaller and faster response. In HTTP/1.1 the ETag and If-None-Match headers are another way to make conditional GET requests. Both approaches are discussed in later chapters.
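The server's side of this exchange can be sketched as a date comparison. The function `conditional_status` is a hypothetical name used only here, and the sketch covers only the Last-Modified/If-Modified-Since pair, not ETags.

```python
from email.utils import parsedate_to_datetime

def conditional_status(last_modified, if_modified_since=None):
    """Return 304 if the client's cached copy is still current, else 200."""
    if if_modified_since is None:
        return 200  # not a conditional request
    try:
        client_date = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: fall back to a full response
    server_date = parsedate_to_datetime(last_modified)
    return 304 if server_date <= client_date else 200

unchanged = conditional_status("Wed, 22 Feb 2006 04:15:54 GMT",
                               "Wed, 22 Feb 2006 04:15:54 GMT")
updated = conditional_status("Thu, 23 Feb 2006 04:15:54 GMT",
                             "Wed, 22 Feb 2006 04:15:54 GMT")
print(unchanged, updated)
```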
Expires
Conditional GET requests and 304 responses help pages load faster, but they still require making a roundtrip between the client and server to perform the validity check. The Expires header eliminates the need to check with the server by making it clear whether the browser can use its cached copy of a component.
HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Wed, 22 Feb 2006 04:15:54 GMT
Expires: Wed, 05 Oct 2016 19:16:20 GMT
When the browser sees an Expires header in the response, it saves the expiration date with the component in its cache. As long as the component hasn't expired, the browser uses the cached version and avoids making any HTTP requests. A later chapter talks about the Expires and Cache-Control headers in more detail.
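The browser's decision reduces to comparing the current time with the saved expiration date. The sketch below assumes a hypothetical helper, `can_use_cache`, and looks at Expires alone; real browsers also take Cache-Control and other factors into account.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def can_use_cache(expires_header, now):
    """True if the cached component can be used without contacting the server."""
    return now < parsedate_to_datetime(expires_header)

expires = "Wed, 05 Oct 2016 19:16:20 GMT"
before = datetime(2007, 1, 1, tzinfo=timezone.utc)
after = datetime(2017, 1, 1, tzinfo=timezone.utc)
print(can_use_cache(expires, before), can_use_cache(expires, after))
```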
Keep-Alive
HTTP is built on top of Transmission Control Protocol (TCP). In early implementations of HTTP, each HTTP request required opening a new socket connection. This is inefficient because many HTTP requests in a web page go to the same server. For example, most requests for images in a web page go to a common image server. Persistent Connections (also known as Keep-Alive in HTTP/1.0) were introduced to solve the inefficiency of opening and closing multiple socket connections to the same server. They let browsers make multiple requests over a single connection. Browsers and servers use the Connection header to indicate Keep-Alive support. The Connection header looks the same in the server's response.
GET /us.js.yimg.com/lib/common/utils/2/yahoo_2.0.0-b2.js HTTP/1.1
Host: us.js2.yimg.com
User-Agent: Mozilla/5.0 (...) Gecko/20061206 Firefox/1.5.0.9
Accept-Encoding: gzip,deflate
Connection: keep-alive

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Wed, 22 Feb 2006 04:15:54 GMT
Connection: keep-alive
The browser or server can close the connection by sending a Connection: close header. Technically, the Connection: keep-alive header is not required in HTTP/1.1, but most browsers and servers still include it.
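The rules in this paragraph can be summarized as a small decision function. This is a simplified sketch: `connection_persists` is a hypothetical name, and it treats the Connection header as a single token rather than a comma-separated list.

```python
def connection_persists(http_version, connection_header=None):
    """Decide whether the socket stays open after this message."""
    token = (connection_header or "").strip().lower()
    if token == "close":
        return False              # either side can ask to close
    if http_version == "HTTP/1.1":
        return True               # persistent by default in HTTP/1.1
    return token == "keep-alive"  # HTTP/1.0 needs an explicit opt-in

print(connection_persists("HTTP/1.1"))                # persistent by default
print(connection_persists("HTTP/1.0"))                # closes unless opted in
print(connection_persists("HTTP/1.0", "keep-alive"))  # explicit Keep-Alive
```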
Pipelining, defined in HTTP/1.1, allows for sending multiple requests over a single socket without waiting for a response. Pipelining has better performance than persistent connections. Unfortunately, pipelining is not supported in Internet Explorer (up to and including version 7), and it's turned off by default in Firefox through version 2. Until pipelining is more widely adopted, Keep-Alive is the way browsers and servers can more efficiently use socket connections for HTTP. This is even more important for HTTPS because establishing new secure socket connections is more time consuming.
There's More
This chapter contains just an overview of HTTP and focuses only on the aspects that affect performance. To learn more, read the HTTP specification (http://www.w3.org/Protocols/rfc2616/rfc2616.html) and HTTP: The Definitive Guide by David Gourley and Brian Totty (O'Reilly; http://www.oreilly.com/catalog/httptdg). The parts highlighted here are sufficient for understanding the best practices described in the following chapters.
Chapter 3: Rule 1: Make Fewer HTTP Requests
The Performance Golden Rule, as explained in Chapter 1, reveals that only 10–20% of the end user response time involves retrieving the requested HTML document. The remaining 80–90% of the time is spent making HTTP requests for all the components (images, scripts, stylesheets, Flash, etc.) referenced in the HTML document. Thus, a simple way to improve response time is to reduce the number of components, and, in turn, reduce the number of HTTP requests.
Suggesting the idea of removing components from the page often creates tension between performance and product design. In this chapter, I describe techniques for eliminating HTTP requests while avoiding the difficult tradeoff decisions between performance and design. These techniques include using image maps, CSS sprites, inline images, and combined scripts and stylesheets. Using these techniques reduces response times of the example pages by as much as 50%.
Image Maps
In its simplest form, a hyperlink associates the destination URL with some text. A prettier alternative is to associate the hyperlink with an image, for example in navbars and buttons. If you use multiple hyperlinked images in this way, image maps may be a way to reduce the number of HTTP requests without changing the page's look and feel. An image map allows you to associate multiple URLs with a single image. The destination URL is chosen based on where the user clicks on the image.
The figure below shows an example of five images used in a navbar. Clicking on an image takes you to the associated link. This could be done with five separate hyperlinks, using five separate images. It's more efficient, however, to use an image map because this reduces the five HTTP requests to just one HTTP request. The response time is faster because there is less HTTP overhead.
Figure: Image map candidate
You can try this out for yourself by visiting the following URLs. Click on each link to see the roundtrip retrieval time.
When using Internet Explorer 6.0 over DSL (~900 Kbps), the image map retrieval was 56% faster than the retrieval for the navbar with separate images for each hyperlink (354 milliseconds versus 799 milliseconds). That's because the image map has four fewer HTTP requests.
There are two types of image maps. Server-side image maps submit all clicks to the same destination URL, passing along the x,y coordinates of where the user clicked. The web application maps the x,y coordinates to the appropriate action. Client-side image maps are more typical because they map the user's click to an action without requiring a backend application. The mapping is achieved via HTML's MAP tag. The HTML for converting the navbar in the figure above to an image map shows how the MAP tag is used:
<img usemap="#map1" border=0 src="/images/imagemap.gif">
<map name="map1">
  <area shape="rect" coords="0,0,31,31" href="home.html" title="Home">
  <area shape="rect" coords="36,0,66,31" href="gifts.html" title="Gifts">
  <area shape="rect" coords="71,0,101,31" href="cart.html" title="Cart">
  <area shape="rect" coords="106,0,136,31" href="settings.html" title="Settings">
  <area shape="rect" coords="141,0,171,31" href="help.html" title="Help">
</map>
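For comparison, the lookup that a server-side image map would have to perform in the web application can be sketched as follows. The rectangles and destinations are copied from the MAP example above; the function name `resolve_click` and the choice to treat rectangle edges as inclusive are assumptions made for this illustration.

```python
# (left, top, right, bottom) rectangles and their destinations,
# taken from the AREA tags in the client-side example.
AREAS = [
    ((0, 0, 31, 31), "home.html"),
    ((36, 0, 66, 31), "gifts.html"),
    ((71, 0, 101, 31), "cart.html"),
    ((106, 0, 136, 31), "settings.html"),
    ((141, 0, 171, 31), "help.html"),
]

def resolve_click(x, y, areas=AREAS):
    """Map the x,y coordinates of a click to a destination, or None."""
    for (left, top, right, bottom), href in areas:
        if left <= x <= right and top <= y <= bottom:
            return href
    return None  # the click landed in a gap between icons

print(resolve_click(40, 5))
```

With a client-side image map the browser does this lookup itself, which is why no backend application is needed.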
CSS Sprites
Like image maps, CSS sprites allow you to combine images, but they're much more flexible. The concept reminds me of a Ouija board, where the planchette (the viewer that all participants hold on to) moves around the board stopping over different letters. To use CSS sprites, multiple images are combined into a single image, similar to the one shown in the figure below. This is the "Ouija board."
Figure: CSS sprites combine multiple images into a single image
The "planchette" is any HTML element that supports background images, such as a SPAN or DIV. The HTML element is positioned over the desired part of the background image using the CSS background-position property. For example, you can use the "My" icon for an element's background image as follows:
<div style="background-image: url('a_lot_of_sprites.gif');
             background-position: -260px -90px;
             width: 26px; height: 24px;">
</div>
I modified the previous image map example to use CSS sprites. The five links are contained in a DIV named navbar. Each link is wrapped around a SPAN that uses a single background image, spritebg.gif, as defined in the #navbar span rule. Each SPAN has a different class that specifies the offset into the CSS sprite using the background-position property:
<style>
#navbar span {
  width:31px;
  height:31px;
  display:inline;
  float:left;
  background-image:url(/images/spritebg.gif);
}
.home     { background-position:0 0; margin-right:4px; margin-left: 4px;}
.gifts    { background-position:-32px 0; margin-right:4px;}
.cart     { background-position:-64px 0; margin-right:4px;}
.settings { background-position:-96px 0; margin-right:4px;}
.help     { background-position:-128px 0; margin-right:0px;}
</style>

<div id="navbar" style="background-color: #F4F5EB; border: 2px ridge #333; width:
180px; height: 32px; padding: 4px 0 4px 0;">
  <a href="javascript:alert('Home')"><span class="home"></span></a>
  <a href="javascript:alert('Gifts')"><span class="gifts"></span></a>
  <a href="javascript:alert('Cart')"><span class="cart"></span></a>
  <a href="javascript:alert('Settings')"><span class="settings"></span></a>
  <a href="javascript:alert('Help')"><span class="help"></span></a>
</div>
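The background-position offsets in the rules above follow a regular pattern: each icon sits 32 pixels to the right of the previous one in the sprite, so its offset is the icon's index times 32, negated. A small sketch of generating those rules, assuming a hypothetical helper `sprite_rules` and a single horizontal row of icons:

```python
def sprite_rules(names, step=32):
    """Build one background-position rule per icon in a horizontal sprite."""
    rules = []
    for index, name in enumerate(names):
        offset = -index * step  # shift the sprite left to reveal this icon
        x = "0" if offset == 0 else f"{offset}px"
        rules.append(f".{name} {{ background-position:{x} 0; }}")
    return rules

rules = sprite_rules(["home", "gifts", "cart", "settings", "help"])
print("\n".join(rules))
```

The margins in the original rules are left out here because they don't depend on the position in the sprite.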
Inline Images
It's possible to include images in your web page without any additional HTTP requests by using the data: URL scheme. Although this approach is not currently supported in Internet Explorer, the savings it can bring to other browsers makes it worth mentioning.
We're all familiar with URLs that include the http: scheme. Other schemes include the familiar ftp:, file:, and mailto: schemes. But there are many more schemes, such as smtp:, pop:, dns:, whois:, finger:, daytime:, news:, and urn:. Some of these are officially registered; others are accepted because of their common usage.
The data: URL scheme was first proposed in 1995. The specification (http://tools.ietf.org/html/rfc2397) says it "allows inclusion of small data items as 'immediate' data." The data is in the URL itself following this format:
data:[<mediatype>][;base64],<data>
An inline image of a red star is specified as:
<IMG ALT="Red Star"
SRC="data:image/gif;base64,R0lGODlhDAAMALMLAPN8ffBiYvWW
lvrKy/FvcPewsO9VVfajo+w6O/zl5estLv/8/AAAAAAAAAAAAAAAACH5BAEA
AAsALAAAAAAMAAwAAAQzcElZyryTEHyTUgknHd9xGV+qKsYirKkwDYiKDBia
tt2H1KBLQRFIJAIKywRgmhwAIlEEADs=">
I've seen data: used only for inline images, but it can be used anywhere a URL is specified, including SCRIPT and A tags.
The main drawback of the data: URL scheme is that it's not supported in Internet Explorer (up to and including version 7). Another drawback is its possible size limitations, but Firefox 1.5 accepts inline images up to 100K. The base64 encoding increases the size of images, so the total size downloaded is increased.
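Building a data: URL and seeing the size cost of base64 takes only a few lines. This is a sketch: `to_data_url` is a hypothetical helper, and the bytes used are a placeholder that merely starts with a GIF signature, not a real image.

```python
import base64

def to_data_url(data, mediatype="image/gif"):
    """Encode raw bytes as a data: URL in the format shown above."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mediatype};base64,{encoded}"

image_bytes = b"GIF89a" + bytes(range(30))  # placeholder, not a real image
url = to_data_url(image_bytes)
encoded_part = url.split(",", 1)[1]
print(len(image_bytes), "bytes become", len(encoded_part), "base64 characters")
```

Base64 turns every 3 bytes into 4 characters, so the encoded image is about one-third larger than the original.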
The navbar from previous sections is implemented using inline images in the following example.
Because data: URLs are embedded in the page, they won't be cached across different pages. You might not want to inline your company logo, because it would make every page grow by the encoded size of the logo. A clever way around this is to use CSS and inline the image as a background. Placing this CSS rule in an external stylesheet means that the data is cached inside the stylesheet. In the following example, the background images used for each link in the navbar are implemented using inline images in an external stylesheet.
Combined Scripts and Stylesheets
JavaScript and CSS are used on most web sites today. Frontend engineers must choose whether to "inline" their JavaScript and CSS (i.e., embed it in the HTML document) or include it from external script and stylesheet files. In general, using external scripts and stylesheets is better for performance (this is discussed more in a later chapter). However, if you follow the approach recommended by software engineers and modularize your code by breaking it into many small files, you decrease performance because each file results in an additional HTTP request.
Data from 10 top web sites shows that they average six to seven scripts and one to two stylesheets on their home pages. These web sites were selected from http://www.alexa.com, as described in Chapter 1. Each of these scripts and stylesheets requires an additional HTTP request if it's not cached in the user's browser. Similar to the benefits of image maps and CSS sprites, combining these separate files into one file reduces the number of HTTP requests and improves the end user response time.
To be clear, I'm not suggesting combining scripts with stylesheets. Multiple scripts should be combined into a single script, and multiple stylesheets should be combined into a single stylesheet. In the ideal situation, there would be no more than one script and one stylesheet in each page.
The following examples show how combining scripts improves the end user response time. The page with the combined scripts loads 38% faster. Combining stylesheets produces similar performance improvements. For the rest of this section I'll talk only about scripts (because they're used in greater numbers), but everything discussed applies equally to stylesheets.
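One way to keep code modular in development and still serve a single file is a build step that concatenates the modules. The sketch below only illustrates the idea and is not the process used at Yahoo!: `combine_scripts` is a hypothetical function, the module contents are made up, and it assumes each module is a syntactically complete script.

```python
def combine_scripts(sources):
    """Concatenate script sources, in order, into one payload."""
    # A newline between modules keeps a trailing "//" comment in one file
    # from swallowing the first line of the next.
    return "\n".join(source.strip() for source in sources) + "\n"

modules = [
    "var menu = {};   // hypothetical menu module",
    "var editor = {}; // hypothetical editor module",
    "var util = {};   // hypothetical utility module",
]
combined = combine_scripts(modules)
requests_saved = len(modules) - 1
print(requests_saved, "fewer HTTP requests")
```

Order matters: the modules must be listed so that each one appears after the scripts it depends on.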
Conclusion
This chapter covered the techniques we've used at Yahoo! to reduce the number of HTTP requests in web pages without compromising the pages' design. The rules described in later chapters also present guidelines that help reduce the number of HTTP requests, but they focus primarily on subsequent page views. For components that are not critical to the initial rendering of the page, the post-onload download technique described in a later chapter helps by postponing these HTTP requests until after the page is loaded.
This chapter's rule is the one that is most effective in reducing HTTP requests for first-time visitors to your web site; that's why I put it first, and why it's the most important rule. Following its guidelines improves both first-time views and subsequent views. A fast response time on that first page view can make the difference between a user who abandons your site and one who comes back again and again.
Make fewer HTTP requests.
Chapter 4: Rule 2: Use a Content Delivery Network
The average user's bandwidth increases every year, but a user's proximity to your web server still has an impact on a page's response time. Web startups often have all their servers in one location. If they survive the startup phase and build a larger audience, these companies face the reality that a single server location is no longer sufficient—it's necessary to deploy content across multiple, geographically dispersed servers.
As a first step to implementing geographically dispersed content, don't attempt to redesign your web application to work in a distributed architecture. Depending on the application, a redesign could include daunting tasks such as synchronizing session state and replicating database transactions across server locations. Attempts to reduce the distance between users and your content could be delayed by, or never pass, this redesign step.
The correct first step is found by recalling the Performance Golden Rule, described in Chapter 1:
Only 10–20% of the end user response time is spent downloading the HTML document. The other 80–90% is spent downloading all the components in the page.
If the application web servers are closer to the user, the response time of one HTTP request is improved. On the other hand, if the component web servers are closer to the user, the response times of many HTTP requests are improved. Rather than starting with the difficult task of redesigning your application in order to disperse the application web servers, it's better to first disperse the component web servers. This not only achieves a bigger reduction in response times, it's also easier thanks to content delivery networks.
Content Delivery Networks
A content delivery network (CDN) is a collection of web servers distributed across multiple locations to deliver content to users more efficiently. This efficiency is typically discussed as a performance issue, but it can also result in cost savings. When optimizing for performance, the server selected for delivering content to a specific user is based on a measure of network proximity. For example, the CDN may choose the server with the fewest network hops or the server with the quickest response time.
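The two proximity measures mentioned above can be sketched in a few lines of Python. This is only an illustration of the selection idea, not how any particular CDN works: the `EdgeServer` type, the server names, and the hop and latency figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeServer:
    name: str           # hypothetical server label
    hops: int           # network hops between the user and this server
    response_ms: float  # measured response time for this user, in milliseconds


def pick_by_hops(servers):
    """Choose the server with the fewest network hops."""
    return min(servers, key=lambda s: s.hops)


def pick_by_response_time(servers):
    """Choose the server with the quickest response time."""
    return min(servers, key=lambda s: s.response_ms)


# Made-up measurements for a single user.
servers = [
    EdgeServer("edge-east", hops=14, response_ms=82.0),
    EdgeServer("edge-west", hops=6, response_ms=31.5),
    EdgeServer("edge-central", hops=5, response_ms=40.0),
]

fewest_hops = pick_by_hops(servers)         # edge-central
quickest = pick_by_response_time(servers)   # edge-west
```

Note that the two measures can disagree, as they do for this made-up data, which is one reason the choice of proximity measure matters.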
Some large Internet companies own their own CDN, but it's cost effective to use a CDN service provider. Akamai Technologies, Inc. is the industry leader. In 2005, Akamai acquired Speedera Networks, the primary low-cost alternative. Mirror Image Internet, Inc. is now the leading alternative to Akamai. Limelight Networks, Inc. is another competitor. Other providers, such as SAVVIS Inc., specialize in niche markets such as video content delivery.
A look at 10 top Internet sites in the U.S. and the CDN service providers they use shows that:
  • Five use Akamai
  • One uses Mirror Image
  • One uses Limelight
  • One uses SAVVIS
  • Four either don't use a CDN or use a homegrown CDN solution
Smaller and noncommercial web sites might not be able to afford the cost of these CDN services. There are several free CDN services available. Globule (http://www.globule.org) is an Apache module developed at Vrije Universiteit in Amsterdam. CoDeeN (http://codeen.cs.princeton.edu) was built at Princeton University on top of PlanetLab. CoralCDN (http://www.coralcdn.org) is run out of New York University. They are deployed in different ways. Some require that end users configure their browsers to use a proxy. Others require developers to change the URL of their components to use a different hostname. Be wary of any that use HTTP redirects to point users to a local server, as this slows down web pages.
The Savings
The two online examples discussed in this section demonstrate the response time improvements gained from using a CDN. Both examples include the same test components: five scripts, one stylesheet, and eight images. In the first example, these components are hosted on the Akamai Technologies CDN. In the second example, they are hosted on a single web server.
The example with components hosted on the CDN loaded 18% faster than the page with all components hosted from a single web server (1013 milliseconds versus 1232 milliseconds). I tested this over DSL (~900 Kbps) from my home in California. Your results will vary depending on your connection speed and geographic location. The single web server is located near Washington, DC. The closer you live to Washington, DC, the less of a difference you'll see in response times in the CDN example.
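The 18% figure is the reduction in load time relative to the single-server page. As a quick check of the arithmetic, here is a small Python sketch; the function name is my own, and only the two timings come from the test above.

```python
def percent_reduction(baseline_ms, improved_ms):
    """Percentage by which improved_ms is lower than baseline_ms."""
    return (baseline_ms - improved_ms) / baseline_ms * 100


single_server_ms = 1232  # all components on one web server
cdn_ms = 1013            # components hosted on the CDN

reduction = percent_reduction(single_server_ms, cdn_ms)
print(f"{reduction:.1f}% faster")  # prints "17.8% faster", which rounds to 18%
```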
If you conduct your own response time tests to gauge the benefits of using a CDN, it's important to keep in mind that the location from which you run your test has an impact on the results. For example, based on the assumption that most web companies choose a data center close to their offices, your web client at work is probably located close to your current web servers. Thus, if you run a test from your browser at work, the response times without using a CDN are often best case. It's important to remember that most of your users are not located that close to your web servers. To measure the true impact of switching to a CDN, you need to measure the response times from multiple geographic locations. Services such as Keynote Systems (http://www.keynote.com) and Gomez (http://www.gomez.com) are helpful for conducting such tests.
At Yahoo!, this factor threw us off for a while. Before switching Yahoo! Shopping to Akamai, our preliminary tests were run from a lab at Yahoo! headquarters, located near a Yahoo! data center. The response time improvements gained by switching to Akamai's CDN, as measured from that lab, were less than 5% (not very impressive). We knew the response time improvements would be better when we exposed the CDN change to our live users, spread around the world. When we exposed the change to end users, there was an overall 20% reduction in response times on the Yahoo! Shopping site, just from moving all the static components to a CDN.
Chapter 5: Rule 3: Add an Expires Header
Fast response time is not your only consideration when designing web pages. If it were, then we'd all take Rule 1 to an extreme and place no images, scripts, or stylesheets in our pages. However, we all understand that images, scripts, and stylesheets can enhance the user experience, even if it means that the page will take longer to load. Rule 3, described in this chapter, shows how you can improve page performance by making sure these components are configured to maximize the browser's caching capabilities.
Today's web pages include many components and that number continues to grow. A first-time visitor to your page may have to make several HTTP requests, but by using a future Expires header, you make those components cacheable. This avoids unnecessary HTTP requests on subsequent page views. A future Expires header is most often used with images, but it should be used on all components, including scripts, stylesheets, and Flash. Most top web sites are not currently doing this. In this chapter, I point out these sites and show why their pages aren't as fast as they could be. Adding a future Expires header incurs some additional development costs, which are covered in a later section.
Expires Header
Browsers (and proxies) use a cache to reduce the number of HTTP requests and decrease the size of HTTP responses, thus making web pages load faster. A web server uses the Expires header to tell the web client that it can use the current copy of a component until the specified time. The HTTP specification summarizes this header as "the date/time after which the response is considered stale." It is sent in the HTTP response.
Expires: Thu, 15 Apr 2010 20:00:00 GMT
This is a far future Expires header, telling the browser that this response won't be stale until April 15, 2010. If this header is returned for an image in a page, the browser uses the cached image on subsequent page views, reducing the number of HTTP requests by one.
Max-Age and mod_expires
Before I explain how better caching improves performance, it's important to mention an alternative to the Expires header. The Cache-Control header was introduced in HTTP/1.1 to overcome limitations with the Expires header. Because the Expires header uses a specific date, it has stricter clock synchronization requirements between server and client. Also, the expiration dates have to be constantly checked, and when that future date finally arrives, a new date must be provided in the server's configuration.
Alternatively, Cache-Control uses the max-age directive to specify how long a component is cached. It defines the freshness window in seconds. If less than max-age seconds have passed since the component was requested, the browser will use the cached version, thus avoiding an additional HTTP request. A far future max-age header might set the freshness window 10 years in the future.
Cache-Control: max-age=315360000
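The value 315360000 is simply 10 years expressed in seconds, counting each year as 365 days. A short Python sketch of that arithmetic follows; the helper names are illustrative and not part of any library.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def max_age_seconds(years, days_per_year=365):
    """Freshness window in seconds, assuming fixed 365-day years."""
    return years * days_per_year * SECONDS_PER_DAY


def cache_control_header(years):
    """Build a Cache-Control header line with a max-age of the given length."""
    return f"Cache-Control: max-age={max_age_seconds(years)}"


print(cache_control_header(10))  # Cache-Control: max-age=315360000
```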
Using Cache-Control with max-age overcomes the limitations of Expires, but you still might want an Expires header for browsers that don't support HTTP/1.1 (even though this is probably less than 1% of your traffic). You could specify both response headers, Expires and Cache-Control max-age. If both are present, the HTTP specification dictates that the max-age directive will override the Expires header. However, if you're conscientious, you'll still worry about the clock synchronization and configuration maintenance issues with Expires.
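The precedence rule can be illustrated with a simplified sketch of how a cache might work out a response's freshness lifetime. This is not a full implementation of the HTTP caching rules: it assumes the headers arrive in a plain dict with exactly these key spellings, it ignores other Cache-Control directives, and the function name is my own.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, response_time):
    """Return how long a response stays fresh, or None if it says nothing.

    max-age is checked first, so it overrides Expires when both are present.
    """
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match:
        return timedelta(seconds=int(match.group(1)))
    expires = headers.get("Expires")
    if expires:
        # Expires is an absolute date, so the lifetime depends on the clock.
        return parsedate_to_datetime(expires) - response_time
    return None


received = datetime(2010, 4, 15, 19, 0, 0, tzinfo=timezone.utc)
both = {
    "Expires": "Thu, 15 Apr 2010 20:00:00 GMT",
    "Cache-Control": "max-age=60",
}
expires_only = {"Expires": "Thu, 15 Apr 2010 20:00:00 GMT"}

print(freshness_lifetime(both, received))          # 0:01:00 (max-age wins)
print(freshness_lifetime(expires_only, received))  # 1:00:00
```

The Expires-only branch also shows why clock synchronization matters: its result is only as accurate as the clock that supplies response_time.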
Fortunately, the mod_expires Apache module (http://httpd.apache.org/docs/2.0/mod/mod_expires.html) lets you use an Expires header that sets the date in a relative fashion similar to max-age. This is done via the ExpiresDefault directive. In this example, the expiration date for images, scripts, and stylesheets is set 10 years from the time of the request:
<FilesMatch "\.(gif|jpg|js|css)$">
  # mod_expires only generates headers when ExpiresActive is On
  ExpiresActive On
  ExpiresDefault "access plus 10 years"
</FilesMatch>
The time can be specified in years, months, weeks, days, hours, minutes, or seconds. It sends both an Expires header and a Cache-Control max-age header in the response.
Expires: Sun, 16 Oct 2016 05:43:02 GMT
Cache-Control: max-age=315360000
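To show what "relative" means here, the following Python sketch mimics the idea of deriving both headers from the time of the request. It is not the mod_expires implementation; the function name and the dict-of-headers return value are assumptions made for illustration. Given the 315360000-second lifetime shown above, the sample Expires value corresponds to a request made on 19 Oct 2006.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

TEN_YEARS_SECONDS = 315360000  # the max-age value shown above


def far_future_headers(access_time, lifetime_seconds=TEN_YEARS_SECONDS):
    """Compute Expires and Cache-Control relative to the time of access.

    access_time must be a timezone-aware UTC datetime.
    """
    expires_at = access_time + timedelta(seconds=lifetime_seconds)
    return {
        "Expires": format_datetime(expires_at, usegmt=True),
        "Cache-Control": f"max-age={lifetime_seconds}",
    }


request_time = datetime(2006, 10, 19, 5, 43, 2, tzinfo=timezone.utc)
headers = far_future_headers(request_time)
print(headers["Expires"])        # Sun, 16 Oct 2016 05:43:02 GMT
print(headers["Cache-Control"])  # max-age=315360000
```

Because the date is recomputed on every request, there is no fixed expiration date in the configuration to maintain.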
Empty Cache vs. Primed Cache
Using a far future Expires header affects page views only after a user has already visited your site. It has no effect on the number of HTTP requests when a user visits your site for the first time and the browser's cache is empty. Therefore, the impact of this performance improvement depends on how often users hit your pages with a primed cache. It's likely that a majority of your traffic comes from users with a primed cache. Making your components cacheable improves the response time for these users.
When I say "empty cache" or "primed cache," I mean the state of the browser's cache relative to your page. The cache is "empty" if none of your page's components are in the cache. The browser's cache might contain components from other web sites, but that doesn't help your page. Conversely, the cache is "primed" if all of your page's cacheable components are in the cache.
The number of empty versus primed cache page views depends on the nature of the web application. A site like "word of the day" might only get one page view per session from the typical user. There are several reasons why the "word of the day" components might not be in the cache the next time a user visits the site:
  • Despite her desire for a better vocabulary, a user may visit the page only weekly or monthly, rather than daily.
  • A user may have manually cleared her cache since her last visit.
  • A user may have visited so many other web sites that her cache filled up, and the "word of the day" components were pushed out.
  • The browser or an antivirus application may have cleared the cache when the browser was closed.
With only one page view per session, it's not very likely that "word of the day" components are in the cache, so the percentage of primed cache page views is low.
On the other hand, a travel or email web site might get multiple page views per user session and the number of primed cache page views is likely to be high. In this instance, more page views will find your components in the browser's cache.
We measured this at Yahoo! and found that the number of unique users who came in at least once a day with a primed cache ranged from 40–60%
More Than Just Images
Using a far future Expires header on images is fairly common, but this best practice should not be limited to images. A far future Expires header should be included on any component that changes infrequently, including scripts, stylesheets, and Flash components. Typically, an HTML document won't have a future Expires header because it contains dynamic content that is updated on each user request.
In the ideal situation, all the components in a page would have a far future Expires header, and subsequent page views would make just a single HTTP request for the HTML document. When all of the document's components are read from the browser's cache, the response time is cut by 50% or more.
I surveyed 10 top Internet sites in the U.S. and recorded how many of the images, scripts, and stylesheets had an Expires or a Cache-Control max-age header set at least 30 days in the future. As the table below shows, the news isn't good. Three types of components are tallied: images, stylesheets, and scripts. The table gives the number of components that are cacheable for at least 30 days out of the total number of components of each type. Let's see to what extent these sites employ the practice of making their components cacheable:
  • Five sites make a majority of their images cacheable for 30 days or more.
  • Four sites make a majority of their stylesheets cacheable for 30 days or more.
  • Two sites make a majority of their scripts cacheable for 30 days or more.
Table: Components with an Expires header
Web site
Images
Stylesheets
Scripts