Chapter 8. Load Balancing and Caching

This chapter looks at two important concepts for bringing a service-oriented design to production: load balancing and caching. First, load balancing is used to distribute the workload of service requests across multiple processes and servers to increase the reliability and capacity of a system. Second, services apply caching strategies, with HTTP headers and Memcached, to improve response times by reducing the amount of work the services have to perform.

Latency and Throughput

Before diving into the details of load balancing and caching, we need to establish the concepts of latency and throughput. Latency refers to the elapsed time per request, usually measured in milliseconds. Throughput refers to the ...

Get Service-Oriented Design with Ruby and Rails now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.