CHAPTER 10

Cloud Operations

In this chapter, we’ll cover

•   Setting up your logging infrastructure in GCP

•   Creating metrics and alerts

•   Monitoring your applications for performance, uptime, and overall health

Reliability is the best metric for retaining customers. Knowing this, Google created Site Reliability Engineering (SRE), a philosophy similar to DevOps (and often described as a subset or sibling of it) that applies software engineering practices to infrastructure and operations problems.

Even today in most traditional on-premises environments, operations management is typically handled by an IT operations team in charge of infrastructure provisioning, capacity management, cost control, ...
