CHAPTER 13Monitoring Cloud Operations

Once you've containerized your applications and configured your orchestrators, clearly you need somewhere to host your workloads. Thanks to a decade and a half of cloud platform maturity, it is now possible to integrate your workloads tightly with dynamic cloud services. Keeping them running and limiting downtime requires effort, and much depends on how securely they are configured. In this chapter, however, we look at the Ops side of the cloud and how to navigate confidently through the challenges involved.

There are a number of reliable open source cloud monitoring tools that can greatly improve the visibility of your cloud estate and improve your cloud security posture management (CSPM) as a result.

Having a large screen in an office with dashboards offering useful and continuous, real-time updates about cloud infrastructure gained popularity as screens became cheaper, but being able to quickly interrogate varying cloud resources using specific criteria through a browser or API can mean that issues are spotted before they cause headaches. And do not forget that a welcome side effect of keeping a close eye on your cloud resources is also significant cost savings. For example, what if a race condition caused endless EC2 instances to be spawned unnecessarily? Or suppose repeated API calls push AWS account limits to their maximum, causing downtime after AWS directly rate-limited API calls in your account.

In this chapter, we will look at ...

Get Cloud Native Security now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.