Chapter 14. Scaling Up

In Chapter 13, I touched on the four golden signals of monitoring: saturation, latency, traffic, and errors. In this chapter, you will use those signals to help scale and secure applications.

As the amount of traffic increases, you would expect resources to become more saturated and latency to increase. If you are not careful, this can lead to errors and even downtime. You can use the golden signals to help scale your applications to meet the demands of your users.
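To make the link between traffic and saturation concrete, here is a minimal back-of-the-envelope sketch. It assumes each instance can serve a fixed number of requests at once; the function names and the figure of 80 concurrent requests per instance are illustrative assumptions, not values taken from this chapter.

```python
import math


def instances_needed(concurrent_requests: int, per_instance_concurrency: int) -> int:
    """Smallest number of instances that can absorb the current traffic.

    Both parameters are hypothetical inputs for illustration only.
    """
    if per_instance_concurrency <= 0:
        raise ValueError("per_instance_concurrency must be positive")
    return math.ceil(concurrent_requests / per_instance_concurrency)


def saturation(concurrent_requests: int, instances: int, per_instance_concurrency: int) -> float:
    """Fraction of total request capacity in use (1.0 means fully saturated)."""
    capacity = instances * per_instance_concurrency
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return concurrent_requests / capacity


if __name__ == "__main__":
    needed = instances_needed(250, 80)
    print(needed, saturation(250, needed, 80))
```

With 250 concurrent requests and the assumed limit of 80 per instance, three instances would be oversubscribed, so a fourth is needed. Watching saturation approach 1.0 is the signal to add capacity before latency and errors start to climb.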

So far, you have built your services on a core set of Google Cloud products. However, as the demands on your services increase, other options become worth considering. This chapter introduces some of them and the circumstances in which employing them makes sense.

Note

The code for this chapter is in the scaling folder of the GitHub repository.

Skill Service with Memorystore

At the moment, the skill service retrieves Stack Overflow tags from a storage bucket and holds them in memory in the service. This works for a relatively small number of tags, but as the number grows, the memory requirement will eventually exceed what is practical to hold in a single Cloud Run instance. Instead, you can store the tags in a database and retrieve them on demand. This allows you to scale the service horizontally and add more instances as the number of tags increases, with each instance retrieving tags only when they are needed.
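The difference between the two approaches can be sketched as follows. This is a simplified illustration, not the skill service's actual code: the class names, the `search` method, the prefix-lookup behavior, and the `FakeBackend` stand-in for an external store are all assumptions made for the example.

```python
import bisect
from typing import Iterable, Protocol


class Backend(Protocol):
    """Hypothetical interface to an external store that holds the tags."""

    def range_by_prefix(self, prefix: str, limit: int) -> list[str]: ...


class InMemoryTagStore:
    """Current approach: every instance loads and holds all tags itself."""

    def __init__(self, tags: Iterable[str]) -> None:
        self._tags = sorted(set(tags))  # memory grows with the number of tags

    def search(self, prefix: str, limit: int = 10) -> list[str]:
        # Tags sharing a prefix are contiguous in the sorted list.
        start = bisect.bisect_left(self._tags, prefix)
        results = []
        for tag in self._tags[start:start + limit]:
            if not tag.startswith(prefix):
                break
            results.append(tag)
        return results


class OnDemandTagStore:
    """Alternative: instances hold no tags and query the store per request."""

    def __init__(self, backend: Backend) -> None:
        self._backend = backend

    def search(self, prefix: str, limit: int = 10) -> list[str]:
        return self._backend.range_by_prefix(prefix, limit)


class FakeBackend:
    """In-process stand-in for the external store, used only for this sketch."""

    def __init__(self, tags: Iterable[str]) -> None:
        self._tags = sorted(set(tags))
        self.calls = 0  # counts lookups so the on-demand behavior is visible

    def range_by_prefix(self, prefix: str, limit: int) -> list[str]:
        self.calls += 1
        return [t for t in self._tags if t.startswith(prefix)][:limit]


if __name__ == "__main__":
    tags = ["go", "java", "javascript", "python", "pytorch"]
    print(InMemoryTagStore(tags).search("py"))
    print(OnDemandTagStore(FakeBackend(tags)).search("java", limit=1))
```

Because both classes expose the same `search` method, the rest of the service would not need to change when the tags move out of instance memory and into a shared store.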

This would be a good case for Cloud Memorystore. Cloud ...
