Most of my web career has been spent as a backend engineer. As such, I dutifully approached each performance project as an exercise in backend optimization, concentrating on compiler options, database indexes, memory management, etc. There’s a lot of attention and many books devoted to optimizing performance in these areas, so that’s where most people spend time looking for improvements.

In reality, for most web pages, less than 10–20% of the end user response time is spent getting the HTML document from the web server to the browser. If you want to dramatically reduce the response times of your web pages, you have to focus on the other 80–90% of the end user experience. What is that 80–90% spent on? How can it be reduced? The chapters that follow lay the groundwork for understanding today’s web pages and provide 14 rules for making them faster.
In order to know what to improve, we need to know where the user spends her time waiting. Figure 1-1 shows the HTTP traffic when Yahoo!’s home page (http://www.yahoo.com) is downloaded using Internet Explorer. Each bar is one HTTP request. The first bar, labeled html, is the initial request for the HTML document. The browser parses the HTML and starts downloading the components in the page. In this case, the browser’s cache was empty, so all of the components had to be downloaded. The HTML document is only 5% of the total response time. The user spends most of the other 95% waiting for the components to download; she also spends a small amount of time waiting for HTML, scripts, and stylesheets to be parsed, as shown by the blank gaps in Figure 1-1.
Figure 1-2 shows the same URL downloaded in Internet Explorer a second time. The HTML document is only 12% of the total response time. Most of the components don’t have to be downloaded because they’re already in the browser’s cache.
Five components are requested in this second page view:
This redirect was downloaded previously, but the browser is requesting it again. The HTTP response’s status code is 302 (“Found” or “moved temporarily”) and there is no caching information in the response headers, so the browser can’t cache the response. I’ll discuss HTTP in Chapter 2.
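The browser’s decision about that redirect can be sketched as a simple predicate. This is an illustrative simplification, not a real browser’s logic (real caches consult more headers and heuristics); the function name and example headers are my own:

```python
def is_cacheable_redirect(status, headers):
    """Rough sketch: may a browser cache this redirect response?

    A 302 response with no caching information in its headers cannot
    be cached, so the browser must request it again on every page view.
    """
    if status != 302:
        return False  # this sketch considers only 302 redirects
    headers = {name.lower(): value for name, value in headers.items()}
    cache_control = headers.get("cache-control", "")
    if "no-store" in cache_control or "no-cache" in cache_control:
        return False
    # Cacheable only if the server supplies explicit caching information
    return "cache-control" in headers or "expires" in headers

# The redirect described above: status 302, no caching headers
print(is_cacheable_redirect(302, {"Location": "http://www.yahoo.com/"}))  # False
```

Because the function returns False, the browser has no choice but to download the redirect again, exactly as the chart shows.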
The next three requests are for images that were not downloaded in the initial page view. These are images for news photos and ads that change frequently.
The last HTTP request is a conditional GET request. The image is cached, but because of the HTTP response headers, the browser has to check that the image is up-to-date before showing it to the user. Conditional GET requests are also described in Chapter 2.
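In outline, the exchange works like this. The sketch below is hypothetical: real revalidation also uses ETag/If-None-Match headers and proper date comparison rather than string equality, and the dates are made up:

```python
def conditional_get(cached_last_modified, server_last_modified):
    """Sketch of a conditional GET: the browser asks the server whether
    its cached copy is still current before showing it to the user."""
    request_headers = {"If-Modified-Since": cached_last_modified}
    if server_last_modified == request_headers["If-Modified-Since"]:
        # 304 Not Modified: no body is sent; the cached copy is reused
        return 304, None
    # The component changed: a full 200 response with a new body is sent
    return 200, b"...image bytes..."

status, body = conditional_get("Tue, 01 May 2007 12:00:00 GMT",
                               "Tue, 01 May 2007 12:00:00 GMT")
print(status, body)  # 304 None
```

The 304 response is small because it carries no body, which is why a conditional GET is much cheaper than re-downloading the image, though it still costs a round trip.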
Looking at the HTTP traffic in this way, we see that at least 80% of the end user response time is spent on the components in the page. If we dig deeper into the details of these charts, we start to see how complex the interplay between browsers and HTTP becomes. Earlier, I mentioned how the HTTP status codes and headers affect the browser’s cache. In addition, we can make these observations:
Varying numbers of HTTP requests occur in parallel. Figure 1-2 has a maximum of three HTTP requests happening in parallel, whereas in Figure 1-1, there are as many as six or seven simultaneous HTTP requests. This behavior is due to the number of different hostnames being used, and whether they use HTTP/1.0 or HTTP/1.1. Chapter 8 explains these issues in the section “Parallel Downloads.”
Parallel requests don’t happen during requests for scripts. That’s because in most situations, browsers block additional HTTP requests while they download scripts. See Chapter 8 to understand why this happens and how to use this knowledge to improve page load times.
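The first of these observations can be sketched with a back-of-the-envelope calculation. The connection limit and component URLs below are illustrative assumptions, not measurements from the charts:

```python
from urllib.parse import urlparse

# Browsers open a limited number of simultaneous connections per hostname
# (HTTP/1.1 suggested two). This value is illustrative; browsers vary.
CONNECTIONS_PER_HOSTNAME = 2

def max_parallel_downloads(component_urls):
    """Upper bound on simultaneous HTTP requests for a page's components."""
    hostnames = {urlparse(url).hostname for url in component_urls}
    return CONNECTIONS_PER_HOSTNAME * len(hostnames)

components = [                            # hypothetical component URLs
    "http://us.example.com/logo.gif",
    "http://us.example.com/nav.gif",
    "http://img.example.com/photo.jpg",
    "http://img.example.com/ad.gif",
]
print(max_parallel_downloads(components))  # 4: two hostnames x two connections
```

This is why a page served from a single hostname tops out at a low degree of parallelism, while spreading components across several hostnames allows more simultaneous requests, as Chapter 8 discusses.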
Figuring out exactly where the time goes is a challenge. But it’s easy to see where the time does not go—it does not go into downloading the HTML document, including any backend processing. That’s why frontend performance is important.
This phenomenon of spending only 10–20% of the response time downloading the HTML document is not isolated to Yahoo!’s home page. It holds true for all of the Yahoo! properties I’ve analyzed (except for Yahoo! Search, because of the small number of components in its page). Furthermore, it is true across most web sites. Table 1-1 shows ten top U.S. web sites extracted from http://www.alexa.com. All of these except AOL were in the top 10 U.S. web sites. Craigslist.org was in the top 10, but its pages contain few or no images, scripts, and stylesheets, making it a poor example, so I included AOL in its place.
All of these web sites spend less than 20% of the total response time retrieving the HTML document. The one exception is Google in the primed cache scenario. This is because http://www.google.com had only six components, and all but one were configured to be cached by the browser. On subsequent page views, with all those components cached, the only HTTP requests were for the HTML document and an image beacon.
In any optimization effort, it’s critical to profile current performance to identify where you can achieve the greatest improvements. It’s clear that the place to focus is frontend performance, for three main reasons.
First, there is more potential for improvement in focusing on the frontend. If we were able to cut backend response times in half, the end user response time would decrease by only 5–10% overall. If, instead, we cut frontend response times in half, we would reduce overall response times by 40–45%.
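The arithmetic behind those numbers is straightforward. Assuming the HTML document accounts for 10% of the end user response time:

```python
html_share = 0.10        # backend (HTML document): 10% of response time
components_share = 0.90  # frontend (components): the remaining 90%

backend_savings = html_share / 2         # halve backend response time
frontend_savings = components_share / 2  # halve frontend response time

print(f"halving the backend saves {backend_savings:.0%} overall")
print(f"halving the frontend saves {frontend_savings:.0%} overall")
```

With a 20% HTML share the same calculation yields 10% and 40%, which is where the 5–10% and 40–45% ranges come from.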
Second, frontend improvements typically require less time and fewer resources. Reducing backend latency involves projects such as redesigning application architecture and code, finding and optimizing critical code paths, adding or modifying hardware, distributing databases, etc. These projects take weeks or months. Most of the frontend performance improvements described in the following chapters involve best practices, such as changing web server configuration files (Chapter 5 and Chapter 6); placing scripts and stylesheets in certain places within the page (Chapter 7 and Chapter 8); and combining images, scripts, and stylesheets (Chapter 3). These projects take hours or days—much less than the time required for most backend improvements.
Third, frontend performance tuning has been proven to work. Over 50 teams at Yahoo! have reduced their end user response times by following the best practices described here, many by 25% or more. In some cases, we’ve had to go beyond these rules and identify improvements more specific to the site being analyzed, but generally, it’s possible to achieve a 25% or greater reduction just by following these best practices.
At the beginning of every new performance improvement project, I draw a picture like that shown in Figure 1-1 and explain the Performance Golden Rule:
Only 10–20% of the end user response time is spent downloading the HTML document. The other 80–90% is spent downloading all the components in the page.
Because some of the basic aspects of HTTP are necessary to understand parts of the book, I highlight them in Chapter 2.
After that come the 14 rules for faster performance, each in its own chapter. The rules are listed in general order of priority. A rule’s applicability to your specific web site may vary. For example, Rule 2 is more appropriate for commercial web sites and less feasible for personal web pages. If you follow all the rules that are applicable to your web site, you’ll make your pages 25–50% faster and improve the user experience. The last part of the book shows how to analyze the 10 top U.S. web sites from a performance perspective.