Chapter 8. Automating OpenShift Cluster Operations
What does it mean to operate an OpenShift cluster?
Typically, operating software is divided into two distinct kinds of work:
- Firefighting work
  Something stopped working, so the pager of an operations team goes off and somebody starts looking into the problem immediately until the problem is solved.
- Execution of maintenance tasks that are often repetitive
  This includes installing new software or software updates, updating configuration, refreshing certificates, or cleanup tasks.
This is not much different for OpenShift clusters. After setting up Cluster Monitoring as described in Chapter 7, your cluster should reach out to you, your operations team, or the SRE team when it needs attention. Often the alerts issued by OpenShift will even give you hints on how to mitigate the problem that just occurred.
An OpenShift cluster itself should not need much attention if nothing goes wrong. That means repetitive tasks such as renewing the certificates that OpenShift needs to run should be mostly automated.
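Even when renewal is automated, a small check on a certificate's remaining validity can be a useful safety net. The following is a minimal sketch in Python, not something OpenShift ships: the function names `days_until_expiry` and `needs_attention` and the 30-day threshold are assumptions chosen for illustration. It relies only on the standard library's `ssl.cert_time_to_seconds`, which parses the `notAfter` timestamp format used in certificates.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the number of days between `now` and a certificate's notAfter time.

    `not_after` is expected in the certificate timestamp format,
    for example "Jan 11 00:00:00 2030 GMT". `now` must be timezone-aware.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def needs_attention(not_after: str, now: datetime, threshold_days: int = 30) -> bool:
    """Hypothetical policy: flag certificates expiring within `threshold_days`."""
    return days_until_expiry(not_after, now) < threshold_days


if __name__ == "__main__":
    # Illustrative timestamp only; in practice it would come from a real certificate.
    print(needs_attention("Jan 11 00:00:00 2030 GMT", datetime.now(timezone.utc)))
```

A check like this would typically run on a schedule and feed an alert, so that a failed automatic renewal becomes planned maintenance work rather than a firefighting incident.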
An exception to that is installing updates. Installing updates to the next minor or major version should be nondisruptive to the workload of your cluster, but installing the update is a maintenance task you need to take care of yourself.
That means most of the remaining maintenance tasks are specific to your deployment of the OpenShift cluster. How you want to handle cluster updates is up to you. What kind of SSL certificate you’re using for ...