Scalability and Bottlenecks
Before we jump into the patterns, let’s take a minute to discuss what we mean by a scalable system. Think of a web-based system as a request processor. Requests come in from the clients, and the clients wait until results are generated. Everything in between—whether it’s simply returning the contents of a static file or generating a fully dynamic page—is the actual processing.
For a request processor, scalability is related to the number of requests that can be processed simultaneously. In a simple sense, scalability might be the ability to “survive” a certain number of hits at the same time, eventually delivering a proper response to each one, but experience tells us that survival alone is not enough: response time matters too. If a news site gets 10,000 simultaneous hits and responds to each of them within three seconds, we might say the site scales adequately, if not exceptionally. But if the same site gets 100,000 simultaneous hits, responding to each one within three minutes would not be acceptable.[1]
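The arithmetic behind the news-site example can be sketched in a few lines. This is only a rough illustration: the function name `min_throughput` is hypothetical, and it assumes the site finishes all of the simultaneous hits exactly at the stated deadline, so the figures are lower bounds on sustained throughput rather than measurements.

```python
def min_throughput(requests: int, max_latency_s: float) -> float:
    """Lowest sustained rate (requests/second) that completes `requests`
    simultaneous hits within `max_latency_s` seconds.

    Assumption: every request finishes by the deadline and no sooner,
    so this is a lower bound, not a measured figure.
    """
    return requests / max_latency_s


# Figures taken from the example in the text.
small = min_throughput(10_000, 3)       # 10,000 hits answered within 3 seconds
large = min_throughput(100_000, 180)    # 100,000 hits answered within 3 minutes

load_growth = 100_000 / 10_000          # how much the load grew
latency_growth = 180 / 3                # how much the response time grew

print(f"{small:.0f} req/s vs {large:.0f} req/s")
print(f"load x{load_growth:.0f}, response time x{latency_growth:.0f}")
```

Under these assumptions, a tenfold increase in load comes with a sixtyfold increase in response time, so the implied throughput falls by a factor of about six instead of holding steady. That is the sense in which the site merely survives the larger load and does not scale to it.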
A better definition of scalability is a system’s ability to grow in order to handle increased demand. Obviously, no single server can be expected to handle an infinite number of requests. In a scalable system, you have options when a single server has reached its maximum capacity. In general, you can:
Buy a faster server
Buy more servers
While it may seem obvious that a faster server can handle more requests, it is not always the case. Imagine a bank that stores its total assets ...