Capacity planning is another important part of tuning the performance of your Tomcat server in production. No matter how much configuration-file tuning and testing you do, it won't help much if you don't have the hardware and bandwidth your site needs to serve the volume of traffic you expect.
Here's a loose definition of capacity planning as it applies in this section: capacity planning is the activity of estimating the computer hardware, operating system, and bandwidth a web site needs. It involves studying or estimating the total network traffic the site will have to handle, deciding on acceptable service characteristics, and finding hardware and operating systems that meet or exceed the server software's requirements for delivering that service. In this case, the server software includes Tomcat, as well as any third-party web servers and load balancers that you are using "in front" of Tomcat.
If you don't do any capacity planning before you buy and deploy your production servers, you won't know if the server hardware can handle your web site's traffic load. Or, worse still, you won't realize the error until you've already ordered, paid for, and deployed applications on the hardware—usually too late to change direction very much. You can usually add a larger hard drive or even order more server computers, but sometimes it's less expensive overall to buy and/or maintain fewer server computers in the first place.
The higher the volume of traffic on your web site, or the larger the load that is generated per client request, the more important capacity planning becomes. Some sites get so much traffic that only a cluster of server computers can handle it all within reasonable response time limits. Conversely, sites with less traffic have less of a problem finding hardware that meets all their requirements. It's true that throwing more or bigger hardware at the problem usually fixes things, but, especially in the high traffic cases, that may be prohibitively costly. For most companies, the lower the hardware costs are (including ongoing maintenance costs after the initial purchase), the higher profits can be. Another factor to consider is employee productivity. If having faster hardware would make the developers 20 percent more effective in getting their work done quickly, for example, then depending on the size of the team, it may be worth the hardware cost difference to order bigger/faster hardware up front.
Capacity planning is usually done at upgrade points as well. Before ordering replacement hardware for existing mission-critical server computers, it's probably a good idea to gather information about what your company needs, based on updated requirements, common traffic load, software footprints, etc.
There are at least a couple of common methods of arriving at decisions when conducting capacity planning. In practice, we've seen two main types: anecdotal approaches and academic approaches, such as enterprise capacity planning.
Anecdotal capacity planning is a sort of light capacity planning that isn't meant to be exact, just close enough to keep a company out of the situations that doing no capacity planning at all would cause. This method follows capacity and performance trends obtained from previous industry experience. For example, you could make your best educated guess at how much outgoing network traffic your site will have at its peak usage (ideally based on figures from some other real-world site), and double that figure. The doubled figure becomes your site's outgoing bandwidth requirement, and you make sure to buy and deploy hardware that can handle it. Most people do capacity planning this way because it's quick and requires little effort.
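The "estimate the peak, then double it" rule of thumb is simple arithmetic, and a minimal sketch may make it concrete. The class name `BandwidthEstimate`, the method `requiredMbps`, and the 40 Mbit/s peak figure are all hypothetical and used only for illustration; substitute your own estimate.

```java
import java.util.Locale;

// Hypothetical helper illustrating the anecdotal rule of thumb:
// take an educated guess at peak outgoing traffic and multiply it
// by a safety factor (2.0 in the text).
final class BandwidthEstimate {

    /** Returns the outgoing bandwidth to provision, in Mbit/s. */
    static double requiredMbps(double estimatedPeakMbps, double safetyFactor) {
        if (estimatedPeakMbps < 0 || safetyFactor < 1.0) {
            throw new IllegalArgumentException(
                "peak must be non-negative and safety factor at least 1.0");
        }
        return estimatedPeakMbps * safetyFactor;
    }

    public static void main(String[] args) {
        double estimatedPeakMbps = 40.0; // assumed figure, for illustration only
        double required = requiredMbps(estimatedPeakMbps, 2.0);
        System.out.printf(Locale.ROOT,
            "Provision for at least %.1f Mbit/s outgoing%n", required);
    }
}
```

The safety factor is a parameter rather than a hard-coded 2 so that a more cautious or more aggressive margin can be tried without changing the calculation.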
Enterprise capacity planning is meant to be more exact and takes much longer. This method is necessary for sites with a very high volume of traffic, often combined with a high load per request. Detailed capacity planning like this is necessary to keep hardware and bandwidth costs as low as they can be, while still providing the quality of service that the company guarantees or is contractually obligated to live up to. Usually, this involves the use of commercial capacity planning analysis software in addition to iterative testing and modeling. Few companies do this kind of capacity planning, but the few that do are very large enterprises that have a budget large enough to afford doing it (mainly because this sort of thorough planning ends up paying for itself).
The biggest difference between anecdotal and enterprise capacity planning is depth. Anecdotal capacity planning is governed by rules of thumb and is more of an educated guess, whereas enterprise capacity planning is an in-depth requirements-and-performance study whose goal is to arrive at numbers that are as exact as possible.
Which computer architecture(s)? How many computers will your site need? One big one? Many smaller ones? How many CPUs per computer? How much RAM? How much hard drive space and what speed I/O? What will the ongoing maintenance be like? How does switching to different JVM implementations affect the hardware requirements?
How much incoming and outgoing bandwidth will be needed at peak times? How might the web application be modified to lower these requirements?
Which operating system works best for the job of serving your site? Which JVM implementations are available for each operating system, and how well does each one take advantage of the operating system? For example, does the JVM support native multithreading? Symmetric multiprocessing (SMP)? If SMP is supported by the JVM, should you consider multiprocessor server computer hardware? Which serves your webapp faster, more reliably, and less expensively: multiple single-processor server computers or a single four-CPU server computer?
1. Characterize the workload. If your site is already up and running, you can measure the requests per second, summarize the different kinds of possible requests, and measure the resource utilization per request type. If your site isn't running yet, you can make some educated guesses at the request volume and run staging tests to determine the resource requirements.
2. Analyze performance trends. You need to know which requests generate the most load and how the other requests compare. Knowing which requests generate the most load or use the most resources will help you decide what to optimize for the best overall impact on your server computers. For example, if a servlet that queries a database takes too long to send its response, caching some of the data in RAM might safely improve response time.
3. Decide on minimum acceptable service requirements. For example, you may not want the end user to ever wait longer than 20 seconds for a web page response. That means that even during peak load, no request's total time from the initial request to the completion of the response can exceed 20 seconds. That time includes any database queries and filesystem access needed to complete the most resource-intensive request in your application. The minimum acceptable service requirements are up to each company and vary from company to company. Other kinds of service minimums include the number of requests per second the site must be able to serve and the minimum number of concurrent sessions and users.
4. Decide what infrastructure resources you will use, and test them in a staging environment. Infrastructure resources include computer hardware, bandwidth circuits, operating system software, and so on. Order, deploy, and test at least one server machine that mirrors what you'll have for production and see if it meets your requirements. While testing Tomcat, make sure you try more than one JVM implementation, different memory size settings, and different request thread pool sizes (discussed earlier in this chapter).
5. If the configuration tested in step 4 meets your service requirements, you can order and deploy more of the same thing to use as your production server computers. Otherwise, redo step 4 until the service requirements are met.
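As a rough illustration of the workload-characterization and service-requirement steps above, the following sketch groups measured response times by request type and checks them against a maximum response time such as the 20-second example. It is only one possible approach: the class `WorkloadSummary`, its nested types, and the sample figures are hypothetical, and in practice the samples would come from your own measurements (for example, timings taken from access logs or a load-testing tool).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: summarize measured requests per request type and
// check the slowest one against a service requirement.
final class WorkloadSummary {

    /** One observed request: a request-type label and its total response time. */
    static final class Sample {
        final String requestType;
        final long elapsedMillis;
        Sample(String requestType, long elapsedMillis) {
            this.requestType = requestType;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Aggregate figures for one request type. */
    static final class TypeStats {
        final long count;
        final double meanMillis;
        final long maxMillis;
        TypeStats(long count, double meanMillis, long maxMillis) {
            this.count = count;
            this.meanMillis = meanMillis;
            this.maxMillis = maxMillis;
        }
    }

    /** Groups samples by request type and computes count, mean, and maximum time. */
    static Map<String, TypeStats> summarize(List<Sample> samples) {
        Map<String, List<Long>> byType = new TreeMap<>();
        for (Sample s : samples) {
            byType.computeIfAbsent(s.requestType, k -> new ArrayList<>()).add(s.elapsedMillis);
        }
        Map<String, TypeStats> result = new TreeMap<>();
        for (Map.Entry<String, List<Long>> e : byType.entrySet()) {
            long sum = 0;
            long max = 0;
            for (long t : e.getValue()) {
                sum += t;
                max = Math.max(max, t);
            }
            int n = e.getValue().size();
            result.put(e.getKey(), new TypeStats(n, (double) sum / n, max));
        }
        return result;
    }

    /** True if no request type had a response slower than the limit. */
    static boolean meetsLimit(Map<String, TypeStats> stats, long limitMillis) {
        for (TypeStats s : stats.values()) {
            if (s.maxMillis > limitMillis) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up measurements, for illustration only.
        List<Sample> samples = List.of(
            new Sample("static page", 40),
            new Sample("static page", 60),
            new Sample("database report", 25_000));
        Map<String, TypeStats> stats = summarize(samples);
        stats.forEach((type, s) -> System.out.println(
            type + ": count=" + s.count + " mean=" + s.meanMillis + "ms max=" + s.maxMillis + "ms"));
        System.out.println("Meets 20-second limit: " + meetsLimit(stats, 20_000));
    }
}
```

Sorting the per-type figures by mean or maximum time is one way to see which requests generate the most load, which is the comparison step 2 asks for.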
Be sure to document your work. Capacity planning tends to be a time-consuming process, and without documentation it would have to be repeated whenever someone needs to know how your company arrived at its answers. Also, because the testing is iterative, it's important to record the test results of each iteration along with the configuration settings that produced them, so you can tell when your tuning is no longer yielding noticeable improvements.
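One lightweight way to keep such a record is sketched below, assuming you capture a single throughput figure per test iteration. The class `TuningLog`, its methods, the settings strings, and the throughput numbers are all hypothetical; a spreadsheet or a text file serves the same purpose.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: log each test iteration's settings and result, and
// flag when the latest run no longer improves noticeably on earlier ones.
final class TuningLog {

    /** One test iteration: the settings used and the measured throughput. */
    static final class Iteration {
        final String settings;
        final double requestsPerSecond;
        Iteration(String settings, double requestsPerSecond) {
            this.settings = settings;
            this.requestsPerSecond = requestsPerSecond;
        }
    }

    private final List<Iteration> iterations = new ArrayList<>();

    void record(String settings, double requestsPerSecond) {
        iterations.add(new Iteration(settings, requestsPerSecond));
    }

    /**
     * True when the latest run beat the best earlier run by less than
     * minGainFraction (for example, 0.05 for five percent).
     */
    boolean gainsHaveLeveledOff(double minGainFraction) {
        if (iterations.size() < 2) {
            return false; // nothing to compare against yet
        }
        double bestEarlier = 0.0;
        for (int i = 0; i < iterations.size() - 1; i++) {
            bestEarlier = Math.max(bestEarlier, iterations.get(i).requestsPerSecond);
        }
        double latest = iterations.get(iterations.size() - 1).requestsPerSecond;
        return latest < bestEarlier * (1.0 + minGainFraction);
    }

    public static void main(String[] args) {
        TuningLog log = new TuningLog();
        // Settings and results below are made up, for illustration only.
        log.record("maxThreads=150, -Xmx512m", 400.0);
        log.record("maxThreads=300, -Xmx512m", 520.0);
        log.record("maxThreads=300, -Xmx1024m", 525.0);
        System.out.println("Gains leveled off: " + log.gainsHaveLeveledOff(0.05));
    }
}
```

Comparing against the best earlier run, rather than only the previous one, avoids mistaking a recovery from a bad iteration for real progress.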
Once you've finished with your capacity planning, your site will be much better tuned for performance, mainly due to the rigorous testing of a variety of options. You should have gained a noticeable amount of performance just by having the right hardware, operating system, and JVM combination for your particular use of Tomcat.