Chapter 8. Load Balancing and Caching
This chapter looks at two important concepts for bringing a service-oriented design to production: load balancing and caching. First, load balancing is used to distribute the workload of service requests across multiple processes and servers to increase the reliability and capacity of a system. Second, services apply caching strategies, with HTTP headers and Memcached, to improve response times by reducing the amount of work the services have to perform.
Latency and Throughput
Before diving into the details of load balancing and caching, we need to establish the concepts of latency and throughput. Latency refers to the elapsed time per request, usually measured in milliseconds. Throughput refers to the ...