Chapter 53. Resiliency and Scalability Are Key

Tidjani Belmansour

The rise of the cloud promises a virtually unlimited pool of resources. This opens up a whole range of new opportunities for everyone: the hobbyist, the freelancer, the startup, all the way to the biggest companies in the world.

Suddenly, the infrastructure running our applications can scale from one instance to thousands in a matter of minutes, if not seconds, to meet user demand. Scaling should be designed to work in both directions: out (when demand increases) and in (when demand decreases). This approach is known as horizontal scaling.

Scaling out and scaling in mean increasing and decreasing the number of instances to match the demand for processing power, so that users’ requests are not only fulfilled (i.e., not rejected because the servers are buckling under load) but fulfilled within a reasonable amount of time (we refer to this as reducing the latency of our applications).
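To make the idea concrete, here is a minimal sketch of a horizontal scaling decision. It assumes load is measured in requests per second and that each instance can serve a fixed number of requests per second. The function name, its parameters, and the default limits are hypothetical and chosen for illustration only; real autoscalers typically also consider cooldown periods and other metrics.

```python
import math


def desired_instance_count(requests_per_second: float,
                           capacity_per_instance: float,
                           min_instances: int = 1,
                           max_instances: int = 1000) -> int:
    """Return how many instances are needed to serve the current load.

    All names and default limits here are illustrative assumptions,
    not taken from any particular cloud provider.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Round up so that a partially loaded instance still counts as one instance.
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Clamp to the configured floor and ceiling.
    return max(min_instances, min(needed, max_instances))


# Scale out when demand rises, scale in when it falls.
peak = desired_instance_count(4500, capacity_per_instance=100)
quiet = desired_instance_count(50, capacity_per_instance=100)
```

Under these assumptions, the same function handles both directions: the instance count grows with the load and shrinks back to the configured minimum when the load drops.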

Another approach to scaling is known as vertical scaling. With this approach, we increase (scale up) or decrease (scale down) the computing power of our instances (more CPU, more RAM, etc.) rather than their count.
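By contrast, a vertical scaling decision keeps the instance count fixed and changes the size of the instance. The sketch below assumes a small, made-up catalog of instance sizes; the names and the CPU and RAM figures are hypothetical and do not correspond to any real provider's offerings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceSize:
    name: str
    vcpus: int
    ram_gib: int


# Hypothetical catalog, ordered from smallest to largest.
CATALOG = [
    InstanceSize("small", vcpus=2, ram_gib=4),
    InstanceSize("medium", vcpus=4, ram_gib=16),
    InstanceSize("large", vcpus=16, ram_gib=64),
]


def pick_size(required_vcpus: int, required_ram_gib: int) -> Optional[InstanceSize]:
    """Return the smallest size that satisfies both requirements, or None."""
    for size in CATALOG:
        if size.vcpus >= required_vcpus and size.ram_gib >= required_ram_gib:
            return size
    # No size in this catalog is big enough for the requested resources.
    return None


scaled_up = pick_size(required_vcpus=3, required_ram_gib=8)
scaled_down = pick_size(required_vcpus=1, required_ram_gib=2)
```

Here scaling up or down means moving to a different entry in the catalog, while the number of instances stays the same.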

Ideally, we should aim for horizontal rather than vertical scaling, for at least these two reasons:

  • Vertical scaling doesn’t increase the number of instances: thus, ...
